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imization of  equivalent  energy  functions.    When  discontinuity  detection  is  involved 
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in  the  formulation,  the  energy  functions  to  be  minimized  become  nonconvex.  In  this 
thesis,  we  propose  efficient  computational  algorithms  for  minimizing  these  energy 
functions  with  and  without  discontinuity  detection. 

This  thesis  contains  the  following  four  main  contributions: 

1.  A  new  theoretical  and  algorithmic  framework  based  on  the  capacitance  matrix 
method  for  solving  elliptic  partial  differential  equations  arising  in  early  vision 
and  engineering  problems. 

2.  A  novel  adaptive  preconditioning  algorithm  using  a  wavelet  transform  for  solv- 
ing ill-conditioned  large  sparse  linear  systems  arising  in  regularized  solutions  of 
ill-posed  problems  in  early  vision  and  other  domains. 

3.  Two  new,  robust  and  efficient  algorithms  for  accurate  optical  flow  computation 
namely,  a  modified  gradient-based  formulation  and  an  SSD-based  regularization 
formulation. 

4.  A  new  hybrid  (stochastic  +  deterministic)  search  algorithm  involving  an  in- 
formed genetic  algorithm  (GA)  and  either  one  of  the  two  aforementioned  fast 
numerical  algorithms,  for  solving  nonconvex  optimization  problems  in  early 
vision. 

The  efficiency  of  the  proposed  algorithms  are  demonstrated  through  experi- 
ments on  several  early  vision  problems,  including  surface  reconstruction,  shape  from 
shading,  and  optical  flow  computation.  In  addition  to  computer  vision  applications, 
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the  theory  and  algorithms  developed  in  this  thesis  can  also  be  applied  to  other  ar- 
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CHAPTER  1 
INTRODUCTION 


Early  vision  has  been  one  of  the  most  fundamental  research  topic  in  computer 
vision.  The  primary  goal  of  early  vision  is  to  recover  the  geometric  or  physical 
properties  of  three-dimensional  objects  from  two-dimensional  sensed  data.  Typical 
properties  of  objects  are  the  shape  (  depth  or  orientation  ),  texture,  color,  reflectance 
and  motion  parameters.  Example  tasks  of  early  vision  include  surface  reconstruction, 
optical  flow  computation,  shape  from  shading,  stereo  matching,  edge  detection  and 
lightness  problem  [28,  41,  40,  38,  63,  6,  72].  The  results  of  early  vision  can  be  used 
for  higher  level  tasks  such  as  bin  picking,  object  recognition,  passive  navigation  and 
motion  tracking. 

Early  vision  problems  can  be  regarded  as  inverse  problems.  During  the  imaging 
process,  some  information  is  lost  since  it  projects  three-dimensional  objects  to  two- 
dimensional  images.  Therefore,  it  is  usually  not  possible  to  solve  the  inverse  problems 
from  data  information  only.  In  fact,  most  early  vision  problems  are  ill-posed  problems 
[6].  A  problem  is  said  to  be  well-posed  if  its  solution  (a)  exists,  (b)  is  unique,  and 
(c)  depends  continuously  on  the  input  data  [62].  We  can  transform  an  ill-posed 
problem  into  a  well-posed  one  by  imposing  additional  constraints  (such  as  smoothness 
constraints)  on  the  solution. 


1.1     Formulations  of  Early  Vision  Problems 

Early  vision  problems  may  be  formulated  either  in  a  deterministic  or  a  prob- 
abilistic framework.  A  popular  technique  in  the  deterministic  framework  is  based 
on  regularization  theory  [82,  80]  and  leads  to  the  minimization  of  energy  function- 
al. The  probabilistic  formulation  exploits  the  Markov  random  field  (MRF)  model  to 
characterize  the  function  being  estimated.  We  can  relate  the  smoothness  constraints 
in  the  regularization  theory  to  the  prior  distribution  of  the  MRF  model.  Both  the 
regularization  formulation  and  the  MRF  formulation  of  early  vision  problems  lead 
to  the  minimization  of  equivalent  energy  functions.  We  will  give  the  deterministic 
formulations  of  some  early  vision  problems  in  the  next  chapter. 

Discontinuities  contain  very  crucial  information  in  computer  vision.  There  are 
different  treatments  of  discontinuities  in  early  vision  research.  Early  work  on  vi- 
sual reconstruction  problems  reported  in  literature  ignores  discontinuities  altogether 
[40,  28,  62],  which  yields  undesirable  smoothing  over  discontinuity  locations.  How- 
ever, more  recently  problem  formulations  allow  for  incorporation  of  prespecified  dis- 
continuities [81,  75].  For  both  cases,  the  aforementioned  energy  function  is  convex 
and  quadratic  for  most  early  vision  problems.  Minimization  of  a  convex  quadratic 
function  can  be  achieved  by  solving  the  associated  linear  system.  In  this  thesis,  we 
present  several  efficient  and  novel  computational  algorithms  to  minimize  the  energy 
functions  for  early  vision  problems  with  prespecified  discontinuities. 

There  have  been  many  methods  proposed  to  detect  discontinuities  in  the  under- 
lying function  from  the  observed  data.    An  elegant  method  to  treat  discontinuities 
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is  via  the  introduction  of  a  set  of  binary  variables  called  the  line  process.  Each  line 
process  variable  takes  a  value  of  1  in  the  presence  of  discontinuities  and  zero  other- 
wise. Line  process  has  been  included  in  both  the  deterministic  [53,  8]  and  the  MRF 
[22]  formulations.  The  resulting  energy  function  involves  real  variables  (for  surface 
or  image)  coupled  with  binary  variables  (for  discontinuities)  and  is  therefore  a  non- 
convex  function.  Both  formulations  lead  to  problems  involving  the  minimization  of 
nonconvex  functions. 

1.2     Computational  Algorithms 

For  the  prespecified  discontinuities  case,  numerous  iterative  methods  have  been 
employed  to  minimize  the  energy  functions  namely,  Gauss-Seidel  [14],  successive 
overrelaxation  (SOR),  conjugate  gradient  (CG)  [26],  hierarchical  conjugate  gradi- 
ent (HCG)  [75],  and  multigrid  methods  [79],  the  slowest  being  Gauss-Seidel,  taking 
0(N2)  time  to  converge  (  iV(=  n2)  is  the  number  of  discretization  )  [26]  while,  the 
fastest  being  multi-grid  methods  taking  O(N)  time  to  converge  for  some  regular  class 
of  problems  [29].  Although  the  multi-grid  techniques  have  been  applied  successfully 
to  a  general  class  of  problems,  the  proofs  of  convergence  are  restricted  to  a  special 
class  of  problems  [29,  9,  91].  For  these  results  (  posed  as  solution  for  PDE  problems), 
discontinuity  in  the  solution  is  not  considered.  Therefore,  inclusion  of  discontinuity 
into  the  problem  is  bound  to  deteriorate  the  convergence  speed  of  multigrid  methods. 
In  addition,  the  construction  of  multigrid  needs  to  be  modified  for  nonrect angular 
regions.  The  modification  becomes  quite  complicated  for  implementation. 


Direct  solutions  to  the  associate  linear  systems  from  minimization  of  the  energy 
functions  are  based  on  a  theorem  called  the  capacitance  matrix  theorem  which  was 
stated  and  proved  in  Buzbee  et  al.  [10].  However,  this  capacitance  matrix  theorem 
is  restricted  to  some  early  vision  problems  and  on  the  design  on  numerical  algo- 
rithms [45].  The  direct  solutions  using  the  capacitance  matrix  theorem  employ  the 
Fourier- Toeplitz  method  in  combination  with  LU  decomposition  [72],  the  Cholesky 
decomposition  [10]  or  the  conjugate  gradient  [65]  technique.  The  Fourier- Toeplitz 
method  requires  0(N  log  N)  time  and  the  LU  decomposition,  Cholesky  factoriza- 
tion or  the  conjugate  gradient  require  0(n3)  =  0(Ny/N),  thus  making  the  overall 
complexity  0(N\/N). 

In  this  thesis,  we  present  two  efficient  algorithms  for  the  minimization  of  the  en- 
ergy functions  in  early  vision.  One  is  a  generalized  capacitance  matrix  algorithm  and 
the  other  is  an  adaptive  preconditioning  technique  in  conjunction  with  a  conjugate 
gradient  or  nonlinear  conjugate  gradient  algorithm.  The  development  of  the  gener- 
alized capacitance  matrix  algorithm  is  based  on  the  generalized  capacitance  matrix 
theorems  that  we  state  and  prove  in  this  thesis.  In  our  algorithm,  we  first  transform 
the  linear  system  into  a  Lyapunov  matrix  equation  or  a  cascade  of  two  Lyapunov 
matrix  equations  with  an  appropriate  right-hand  side  (RHS).  This  RHS  is  obtained 
by  solving  an  associated  linear  system  with  a  capacitance  matrix.  The  Lyapunov  ma- 
trix equations  are  solved  using  the  alternating  direction  implicit  method  (ADI)  while 
the  solution  to  the  capacitance  matrix  linear  system  is  obtained  using  a  modified 


bi-conjugate  gradient  technique.  We  prove  that  the  ADI  method  takes  a  constant 
number  of  iterations  with  each  iteration  taking  0(N)  time. 

Our  adaptive  preconditioning  technique  is  used  in  conjunction  with  a  conjugate 
gradient  or  nonlinear  conjugate  gradient  algorithm  to  minimize  the  energy  function. 
This  adaptive  preconditioner  is  constructed  in  a  wavelet  basis  and  adapted  to  the 
spectral  characteristics  of  the  imposed  smoothness  constraint  as  well  as  the  data 
constraint  in  the  energy  function.  The  construction  of  the  preconditioner  makes 
use  of  the  division  of  frequency  domain  property  in  a  wavelet  basis.  Previously, 
there  have  been  some  preconditioners  proposed  in  computer  vision  literature  [75,  61, 
92].  However,  they  are  either  not  adapted  to  the  problem  [75,  92]  or  constructed 
inappropriately  [61].  We  empirically  show  the  superiority  of  our  preconditioner  over 
other  preconditioners  on  various  early  vision  problems. 

To  solve  the  early  vision  problems  with  discontinuity  detection,  both  regular- 
ization  and  MRF  formulations  lead  to  the  minimization  of  a  coupled  (binary-real) 
nonconvex  energy  function.  Numerous  algorithms  have  been  proposed  to  solve  this 
coupled  nonconvex  minimization  problem.  They  can  be  categorized  into  the  deter- 
ministic methods  [8]  [44]  [21]  and  the  stochastic  methods  [22]  [51].  Most  of  the  deter- 
ministic methods  can  only  deal  with  simple  constraints  on  the  line  process  [8,  21,  31]; 
in  addition,  they  can  not  find  the  global  minimum  solution.  The  stochastic  optimiza- 
tion methods  are  primarily  the  simulated  annealing  (SA)  type  algorithms.  Simulated 
annealing  is  a  general  global  optimization  algorithm,  but  its  convergence  rate  is  very 
slow,  thus  making  it  impractical  for  many  applications. 


In  this  thesis,  we  will  briefly  discuss  existing  methods  for  this  coupled  noncon- 
vex  minimization  problem  and  propose  a  novel  hybrid  search  algorithm  as  a  possible 
solution  toward  solving  this  problem.  This  hybrid  search  algorithm  which  is  a  combi- 
nation of  a  stochastic  and  a  deterministic  search  techniques.  For  the  stochastic  search, 
we  propose  an  informed  genetic  algorithm  (GA)  whereas  for  the  deterministic  search, 
we  employ  either  one  of  the  two  deterministic  algorithms  mentioned  above.  Some 
promising  preliminary  experimental  results  of  applying  this  algorithm  on  sparse  data 
surface  reconstruction  along  with  discontinuity  detection  problems  are  presented. 

1.3     Main  Contributions  of  this  Thesis 

This  thesis  contains  the  following  four  main  contributions: 

1.  A  new  theoretical  and  algorithmic  framework  based  on  the  capacitance  matrix 
method  is  proposed  for  solving  elliptic  partial  differential  equations  arising  in 
early  vision  and  engineering  problems.  The  algorithm  is  based  on  the  develop- 
ment of  the  generalized  capacitance  matrix  theorems  which  we  state  and  prove 
in  this  thesis.  We  make  use  of  the  ADI  method  as  a  fast  elliptic  solver  in  this 
algorithm  and  prove  that  it  takes  a  constant  number  of  iterations  to  converge 
within  a  given  error  tolerance  with  each  iteration  taking  O(N)  time. 

2.  A  novel  adaptive  preconditioning  algorithm  using  a  wavelet  transform  is  devel- 
oped for  solving  ill-conditioned  large  systems  of  equations  arising  in  regularized 
solutions  of  ill-posed  problems  in  early  vision  and  other  domains.  Our  pre- 
conditioner  is  constructed  by  approximating  the  spectral  characteristics  of  the 
imposed  smoothness  as  well  as  data  constraints.  This  spectral  approximation  is 


achieved  by  using  the  division  of  frequency  domain  property  in  a  wavelet  basis. 
By  using  this  construction,  our  preconditioner  is  adapted  to  to  the  smoothness 
constraint  as  well  as  data  constraint. 

3.  Two  robust  algorithms  are  proposed  for  computing  optical  flow.  The  first  is  a 
modified  gradient-based  regularization  method,  and  the  other  is  an  SSD-based 
regularization  method.  In  our  modified  gradient-based  method,  we  selectively 
combine  the  image  flow  constraint  and  the  contour-based  flow  constraint  into 
the  data  constraint  in  a  regularization  framework  to  amend  the  errors  in  the 
image  flow  constraint  caused  by  brightness  discontinuities.  Our  SSD-based 
regularization  method  uses  the  SSD  (Sum  of  Squared  Differences)  measure  as 
the  data  constraint  in  a  regularization  framework.  Experimental  results  for 
these  two  algorithms  are  given  to  demonstrate  their  superior  performance. 

4.  A  novel  hybrid  (stochastic  +  deterministic)  search  algorithm  is  proposed  to 
solve  the  nonconvex  optimization  problems  for  early  vision  involving  disconti- 
nuity detection.  This  hybrid  search  algorithm  consists  of  an  informed  genetic 
algorithm  (GA)  and  either  one  of  the  two  aforementioned  numerical  algorithms. 
The  informed  GA  is  used  for  the  line  process  only  in  the  outer  loop  of  our  algo- 
rithm, while  the  deterministic  algorithm  is  employed  to  solve  the  subproblems 
of  minimizing  the  energy  function  given  a  fixed  line  process  configuration. 

The  efficiency  of  the  proposed  algorithms  is  demonstrated  through  experiments 
on  several  early  vision  problems,  including  surface  reconstruction,  shape  from  shad- 
ing, and  optical  flow  computation.  In  addition  to  computer  vision  applications,  the 
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theory  and  algorithms  developed  in  this  thesis  can  also  be  applied  to  many  other 
areas,  such  as  cartography/terrain  mapping,  reverse  engineering,  MPEG  image  se- 
quence compression,  material  science  applications  (velocity  estimation  of  dry  granular 
flows),  engineering  applications  and  mathematical  physics. 

1.4     Thesis  Overview 

The  rest  of  this  thesis  is  organized  as  follows. 

In  chapter  2,  the  deterministic  and  probabilistic  frameworks  of  the  early  vision 
problems  are  briefly  reviewed  and  the  formulations  of  some  early  vision  problems 
namely,  surface  reconstruction,  shape  from  shading  and  optical  flow  computation, 
are  given.  The  deterministic  framework  is  primarily  based  on  the  regularization 
formulation.  For  the  probabilistic  framework,  we  discuss  the  maximum  a  posteriori 
(MAP)  estimation  in  a  coupled  Markov  Random  Field  (MRF)  model. 

In  chapter  3,  the  generalized  capacitance  matrix  theorems  are  stated  with  com- 
plete proof.  These  theorems  are  generalizations  of  using  Sherman- Morrison-Woodbury 
formula  [26]  to  solve  linear  systems  and  the  capacitance  matrix  theorem  proposed  by 
Buzbee  et  al.  [10].  The  structure  of  the  our  generalized  capacitance  matrix  algorithm 
can  be  obtained  directly  from  these  generalized  theorems. 

Chapter  4  gives  the  complete  generalized  capacitance  matrix  algorithm  which 
involves  using  the  ADI  method  for  solving  the  Lyapunov  matrix  equations  and  using 
the  modified  bi-conjugate  gradient  (BCG)  algorithm  for  solving  the  associate  linear 
system  with  a  capacitance  matrix.  We  consider  the  membrane  smoothness,  thin-plate 
smoothness  or  a  linear  combination  of  the  two  in  our  algorithm.  We  demonstrate  the 


efficiency  of  this  algorithm  by  applying  it  to  surface  reconstruction  and  shape  from 
shading  problems. 

In  chapter  5,  the  new  adaptive  preconditioning  technique  is  presented  as  an 
efficient  solution  to  general  early  vision  problems.  Our  adaptive  preconditioner  is 
constructed  in  a  wavelet  basis  based  on  the  approximation  to  the  spectral  character- 
istics of  the  smoothness  and  data  constraints.  We  empirically  show  the  superiority  of 
this  adaptive  preconditioner  over  other  preconditioners  previously  proposed  in  vision 
literature  through  experiments  on  surface  reconstruction,  shape  from  shading  and 
optical  flow  computation. 

Two  new,  robust  and  efficient  algorithms  for  optical  flow  estimation  are  pre- 
sented in  chapter  6.  The  first  is  a  modified  gradient- based  algorithm  which  combines 
the  image  flow  constraint  and  the  contour-based  flow  constraint  into  the  data  con- 
straint in  a  regularization  framework  to  amend  the  errors  in  the  image  flow  constraint 
caused  by  brightness  discontinuities.  We  employ  the  adaptive  preconditioned  conju- 
gate gradient  algorithm,  presented  in  chapter  5,  to  solve  the  linear  system  for  this  new 
formulation.  The  second  is  an  SSD-based  regularization  method  that  uses  the  SSD 
measure  as  the  data  constraint  in  a  regularization  framework.  The  preconditioned 
nonlinear  conjugate  gradient  with  a  modified  search  direction  scheme  is  developed  to 
minimize  the  resulting  energy  function.  Experimental  results  for  these  two  algorithms 
are  given  to  demonstrate  their  performance. 
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In  chapter  7,  we  list  several  directions  for  conducting  future  research  based  on 
the  research  reported  in  this  thesis.  For  the  early  vision  with  discontinuity  detec- 
tion problem,  we  discuss  numerous  existing  techniques  and  propose  a  new  hybrid 
search  algorithm  as  a  possible  approach  toward  solving  this  problem.  By  includ- 
ing line  process  into  the  optimization  of  the  energy  function,  the  early  vision  along 
with  discontinuity  detection  problem  is  formulated  as  the  minimization  of  a  coupled 
(binary-real)  nonconvex  energy  function.  Our  hybrid  search  algorithm  consists  of  a 
stochastic  and  a  deterministic  search  techniques.  For  the  stochastic  search,  we  pro- 
pose an  informed  genetic  algorithm  (GA)  as  a  global  minimizer  for  the  binary  line 
process  only,  whereas  for  the  deterministic  search,  we  employ  either  one  of  the  two 
deterministic  algorithms  mentioned  above  to  solve  the  minimization  problem  for  each 
fixed  line  process  configuration  inside  the  GA.  Promising  preliminary  experimental 
results  of  using  this  algorithm  on  the  sparse  data  surface  reconstruction  problems  are 
given. 

Finally,  we  summarize  the  results  of  this  thesis  in  chapter  8. 


CHAPTER  2 
PROBLEM  FORMULATIONS 


There  have  been  two  classes  of  formulations  proposed  for  the  early  vision  prob- 
lems; one  is  the  deterministic  formulation,  and  the  other  is  the  probabilistic  one.  The 
deterministic  approach  is  based  on  the  regularization  theory  proposed  by  Tikhonov 
[82].  The  regularization  method  introduces  a  stabilizing  operator  to  well-pose  the 
original  problem  and  leads  to  a  variational  principle.  The  most  popular  probabilistic 
formulation  to  early  vision  problems  involves  the  use  of  an  MRF  (Markov  Random 
Field)  model.  Although  there  is  some  correspondence  between  the  aforementioned 
stabilizing  operator  and  the  MRF  model,  the  MRF  model  is  known  to  be  more  ver- 
satile to  model  surfaces  or  images  than  the  deterministic  model  [51]  [74]. 

2.1     Deterministic  Framework 

Variational  formulations  for  various  early  vision  problems  have  been  reported 
in  [62,  6].  These  formulations  make  use  of  regularization  theory.  In  a  regularization 
framework,  generic  smoothness  assumptions  are  imposed  on  the  solution  space  prior 
to  attempting  any  functional  minimization.  The  smoothness  constraints  are  well 
characterized  by  a  class  of  generalized  multi-dimensional  spline  functionals  [80].  The 
formulations  involve  minimization  of  an  energy  functional  £,  which  is  the  sum  of  the 
energy  contribution  from  the  smoothness  constraint  (£s)  and  the  data  constraint  (£d) 
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i.e., 

Find  u   such   that   £(u)  =  mfveli   6(v)  (2.1) 

where,  H  defines  the  linear  admissible  space  of  smooth  functions  defined  on  3ft2  and 
the  functional 

S(v)  =  V(v)  +  XS{v)  (2.2) 

Where,  A  is  called  the  regularization  parameter  that  controls  the  contribution  of  the 
smoothness  term,  S(v)  :  H  ►-*  3ft  is  a  functional  on  H  that  is  a  measure  of  smoothness 
of  an  admissible  function  v(x,y),  and  V{v)  :  H  *-*  3ft  is  a  functional  that  measures 
the  discrepancy  between  the  observation  model  and  the  observed  data  [80]. 

To  solve  the  variational  problem,  a  finite  difference  method  [37]  or  a  finite  el- 
ement method  [78]  can  be  used  to  discretize  the  problem  domain.    The  infinite- 
dimensional  minimization  problem  in  equation  2.1  can  be  reduced  to  a  finite-dimensional 
minimization  problem  as  follows: 

min£(u)   with  E(u)  =  D(u,d)  +  A5(u)  (2.3) 

where  u  €  3ft"2  xl  is  the  discretized  surface,  d  e  3ftn  xl  is  the  data  vector,  and  the 
discretization  mesh  is  n  x  n.  In  general,  the  observation  model  can  be  nonlinear, 
e.g.  the  shape  from  shading  problem  with  a  nonlinear  reflectance  map  function. 
Let's  denote  this  model  by  K^u  =  d  ,  where  the  operator  Kj  is  a  matrix  when  the 
model  is  linear,  and  choose  the  stabilizing  (  or  smoothing  )  operator  as  the  first-order 
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derivative  along  x  and  y,  then 


D(u,d)    =    UKrfU-dH'  (2.4) 

5(u)    =    Vksu  (2.5) 


For  the  choice  of  first-order  differential  stabilizer,  i.e.  the  membrane  smoothness,  the 
operator  Ks  is  a  Laplacian  matrix. 

The  regularization  method  imposes  the  smoothness  constraint  over  the  entire 
domain.  It  also  smoothes  out  the  discontinuities  which  is  very  undesirable.  Incorpo- 
rating the  discontinuity  detection  into  the  early  vision  problem,  the  energy  function 
to  be  minimized  can  be  modified  as  [8] 

£(u,h,v)    =    ||  Kdu  -  d  ||2  +  J3((u<+W  -  uU)J)2(l  -  hitj)  +  (t*o+1  -  uUJ)2(l  -  vtj)) 
+  £V(h,v)  (2-6) 


hj 


where  /jtJ  and  u,j  are  the  horizontal  and  vertical  line  processes,  respectively,  which 
are  binary  variables  with  a  value  1  denoting  a  vertical  or  horizontal  discontinuity 
at  that  location  and  a  0  otherwise.  The  line  energy  V(h,v)  is  a  term  that  may 
be  used  to  penalize  the  inclusion  of  discontinuities  or  to  impose  the  smoothness  of 
the  line  process.  Most  of  the  deterministic  formulations  take  very  simple  functions 
for  V(h,v)  merely  based  on  some  heuristics;  for  example,  Blake  and  Zisserman  [8] 
take  V(h,  v)  =  hij  +  u,j,  which  has  no  interaction  between  individual  line  processes 
variables. 
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The  energy  in  equation  2.6  is  a  nonconvex  function  of  real  variables  coupled 
with  boolean  variables.  It  is  difficult  to  solve  a  nonconvex  minimization  problem 
since  traditional  gradient  decent  methods  are  easily  trapped  in  local  minima. 

The  deterministic  formulation  leads  to  satisfactory  results  for  some  early  vision 
problems,  but  this  formulation  is  not  general  enough  to  include  arbitrary  constraints 
for  the  line  process.  We  will  see  in  the  next  section  that  the  probabilistic  formu- 
lation using  the  MRF  model  is  more  flexible  in  this  context  than  the  deterministic 
formulation. 

2.2     Probabilistic  Framework 

MRF  (Markov  Random  Field)  is  a  generalization  of  Markov  process  in  ID  to 
2D.  To  define  an  MRF  [22],  we  have  to  define  a  neighborhood  system  first.  Let  S  be 
a  finite  set  ofnxn  sites,  and  G  =  {Gij,  (i,j)  G  S}  be  a  neighborhood  system  for  S 
such  that 

(1)  (»,i)  t  Gij,  for  all  (ij)  €  S. 

(2)  (i,j)  €  GkJ,  if  and  only  if  (*,  /)  €  Gy,  for  all  (i,j),  (k,  I)  <E  S. 

Let  F  =  {Fij,  (i,j)  €  S}  be  a  random  field,  which  is  a  family  of  random  variables 
indexed  by  the  two-dimensional  site  (i,j)  €  S.  Any  possible  sample  realization 
f  =  (/iti,. .  .,/„,„)  is  called  a  configuration  of  the  field.  Let  17  be  the  set  of  all 
possible  configurations,  i.e.  the  sample  space,  and  P  be  a  probability  measure  on  Cl. 
Then  F  is  an  MRF  with  respect  to  G  if 

(1)  P(F  =  f)  >  0,  for  all  f  £  H. 
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(2)  P(Fitj  =  fa  |  Fw  =  fk,h(k,l)  /  (•,/))  =  P(Fij  =  A;  I  Fk,i  =  fk,i,(k,l)  e 

It  is  not  clear  how  to  find  a  probability  distribution  function  for  an  MRF  directly 
from  the  above  definition.  Fortunately,  the  Hammersley- Clifford  theorem  [22]  states 
that  F  is  an  MRF  on  S  with  respect  to  the  neighborhood  system  G  if  and  only  if 
the  probability  distribution  is  a  Gibbs  distribution  with  respect  to  G.  Therefore,  we 
can  construct  an  MRF  by  designing  a  Gibbs  distribution  function  as  the  distribution 
function  of  the  MRF. 

To  give  a  definition  of  a  Gibbs  distribution,  we  have  to  define  a  "clique"  first. 
A  subset  C  C  S  is  a  clique  if  every  pair  of  distinct  sites  in  C  are  neighbors.  Let  C 
denote  the  set  of  all  cliques. 

A  Gibbs  distribution  relative  to  {S,  G}  is  a  probability  measure  7r  on  ft  such 
that 

*M  =  iexp-^>/r  (2.7) 

where  Z  is  the  normalizing  constant  (  also  called  partition  function),  T  is  a  parameter, 
and  the  energy  function  U(u>)  is  of  the  form 


U{U)  =  £  Vfc(W)  (2.8) 

cec 


The  function  Vc{u)  is  called  the  clique  potential.  Its  value  is  determined  by  u)^  with 
(ij)  G  C. 
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For  the  sake  of  exposition,  let's  consider  the  surface  reconstruction  problem 
here,  but  the  discussion  to  follow  should  apply  to  most  early  vision  problems  with 
minor  modifications.  A  coupled  MRF  model  may  be  constructed  for  the  surface 
reconstruction  problem  by  defining  an  MRF,  F,  for  surface  and  a  boolean  MRF,  L, 
for  line  process.  The  prior  probability  can  be  written  as 


P(F  =  f,L  =  l)    =    4exP_t;(f'1)/:r  (2-9) 

U(f,l)    =    £Vc(f,l)  (2.10) 

c 


The  parameter  T  is  assumed  to  be  1  for  simplicity.  The  prior  energy  £/(f,  1)  is 
similar  to  the  last  two  terms  of  i?(u,  h,  v)  in  equation  2.6,  which  are  the  smoothness 
constraints  on  the  surface  and  line  process. 

Let  the  observed  data  be  denoted  by  d,  then  the  problem  to  estimate  f  and  1 
from  d  can  be  obtained  by  the  MAP  (maximum  a  posteriori)  estimation.  The  MAP 
estimate  is  obtained  by  using  Bayes'  rule  and  maximizing  the  posterior  probability 
P(F  =  f ,  L  =  1  |  D  =  d).  Assume  the  observation  model  is  given  by  D  =  H(F)  +  N, 
where  N  is  zero  mean  white  Gaussian  random  field  with  variance  a  for  each  variable 
in  n.  After  applying  the  Bayes  rule,  we  can  see  that  the  MAP  estimation  is  equivalent 
to  maximizing  P(D  =  d  |  F  =  f ,  L  =  1)P(F  =  f ,  L  =  1),  and 

P(D  =  d  |  F  =  f ,  L  =  1)  =  (27r<72)-n2/2  exp-"d_2^f)"2  (2.11) 
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Combining  equations  2.9  and  2.11,  we  can  see  that  the  MAP  estimate  can  be  obtained 
by  minimizing  the  following  energy  function 


U(f,l\d)    =    ^||d-tf(f)||2+[/(f,l)  <2-12) 

=    -^  ||  d  -  H(f)  ||2  +  £  Vc(t  I  1)  +  £  Vc(\)  (2.13) 

to  c  c 


Investigating  equations  2.6  and  2.13,  we  can  see  that  these  two  energy  functions  are 
very  similar.  If  we  choose  the  parameters  and  the  clique  potential  to  make  these 
two  energy  functions  identical,  then,  both  the  deterministic  formulation  and  the 
MRF  formulation  give  rise  to  minimizing  the  same  energy  function.  Therefore,  the 
correspondence  between  the  two  formulations  is  obvious. 

The  advantages  of  the  probabilistic  model  over  a  deterministic  model  are  two- 
fold: 

1.  The  MRF  model  can  include  very  general  constraints  on  the  line  process,  while 
the  deterministic  model  is  restricted  in  the  types  of  constraints  considered. 
The  MRF  model  regards  the  imposed  constraints  as  a  prior  probability,  which  is 
determined  by  the  clique  potential.  In  fact,  we  can  estimate  the  clique  potential 
functions  for  different  object  models  instead  of  ad  hoc  or  heuristic  choices  [54]. 

2.  A  probabilistic  model  can  provide  second  (covariance)  or  higher  order  statistics 
of  the  estimate  in  addition  to  the  MAP  estimate.  The  higher  order  statis- 
tical information  is  very  useful  in  the  context  of  dynamic  vision  [74],  sensor 
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fusion,  and  integration  of  multiple  modules  [51].  This  can  not  be  achieved  by 
a  deterministic  model. 

Therefore,  it  can  be  concluded  that  the  MRF  model  is  more  versatile  than  the 
deterministic  models.  The  efficient  computational  algorithms  proposed  in  this  thesis 
can  be  applied  to  solve  the  minimization  problems  for  both  the  deterministic  and 
probabilistic  formulations. 

2.3     Variational  Formulations  for  Some  Early  Vision  Problems 

From  the  previous  two  sections,  we  can  see  the  link  between  the  regulariza- 
tion  and  MRF  formulations,  which  lead  to  the  minimization  of  equivalent  energy 
functions.  In  this  section,  we  give  the  deterministic  formulation  of  some  early  vi- 
sion problems  namely,  surface  reconstruction,  shape  from  shading  and  optical  flow 
computation.  The  corresponding  probabilistic  formulation  can  be  directly  obtained 
from  the  equivalence  between  these  two  formulations  [51].  For  ease  of  exposition,  the 
location  of  surface  and  orientation  discontinuities  (if  any)  are  assumed  to  be  known 
in  the  below  discussion.  However,  we  will  discuss  the  issue  of  discontinuity  detection 
in  chapter  7. 

Variational  formulations  for  various  early  vision  problems  have  been  reported 
in  [79]  and  references  therein.  These  formulations  make  use  of  the  popular  theory 
of  regularization.  In  a  regularization  framework,  generic  smoothness  assumptions 
are  imposed  on  the  solution  space  prior  to  attempting  any  functional  minimization. 
The  smoothness  constraints  are  well  characterized  by  a  class  of  generalized  multi- 
dimensional spline  functionals  [80].     The  formulations  involve  minimization  of  an 
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energy  functional  £,  which  is  the  sum  of  the  energy  contribution  from  the  smoothness 
constraint  (S)  and  the  data  constraint  (V)  (see  [80,  75]  for  details  of  this  formulation). 

2.3.1     Surface  Reconstruction 

For  the  surface  reconstruction  problem,  we  are  faced  with  recovering  surface 
shape  from  either  sparse  or  dense  range  data.  For  a  first  order  stabilizer,  the  smooth- 
ness constraint  is  given  by 

S(v)  =  jja(vl  +  v2y)  dxdy,  (2.14) 

where  u(x,y)  is  the  admissible  function  and  vx,  vy  its  partial  derivatives  assumed  to 
be  small.  For  the  case  of  a  second  order  stabilizer,  we  have  the  following  smoothness 
constraint 

S(v)  =  j  Jyxx  +  2v2xy  +  v2yy)  dxdy,  (2.15) 

The  first  and  second  order  stabilizer  can  be  combined  to  form  the  controlled-continuity 
stabilizers  [80].  To  the  smoothness  constraint,  we  add  the  data  constraints  in  the 
form  of  penalty  terms.  The  following  penalty  term  which  measures  the  discrepancy 
between  the  surface  and  data  weighted  by  the  uncertainty  in  the  data  may  be  used, 


V(v)=1-J2ci(v(xhy,)-dt)2  (2.16) 
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Where,  d,  are  the  depth  data  points  specified  in  the  domain  il  and  c,  are  the  un- 
certainty associated  with  the  data.  The  goal  is  to  find  a  u  that  minimizes  the  total 
potential  energy  £(v)  =  S(v)  +  V(v). 

The  numerical  solution  to  the  above  problem  is  computed  by  discretizing  the 
functionals  S(v)  and  V{v)  using  finite  element  techniques  [75].  The  resulting  energy 
function  is  a  quadratic  in  x  (a  discretization  of  v)  given  by,  E(x)  =  |xrKx  — xTb  +  c. 
Where,  K  is  a  very  large  and  sparse  N  x  N  matrix  [79],  with  N  being  the  number 
of  discretization  nodes.  The  minimum  x*  of  this  energy  function  is  found  by  solving 
the  large  sparse  linear  system  Kx  =  b.  We  present  two  very  fast  solutions  to  this 
problem  in  this  thesis.  One  is  a  generalized  capacitance  matrix  algorithm  and  the 
other  is  a  novel  adaptive  preconditioning  technique,  which  will  be  given  in  subsequent 
chapters 

2.3.2     Shape  from  Shading 

Numerous  solution  methods  using  variational  principle  formulations  of  the  shape 
from  shading  (SFS)  problem  have  been  proposed.  For  a  comprehensive  set  of  papers 
on  this  topic,  we  refer  the  reader  to  the  book  edited  by  Horn  and  Brooks  [39]  and  the 
work  in  [79,  72,  38,  76].  In  this  problem,  it  is  required  to  recover  the  shape  of  surfaces 
from  image  irradiance  which  depends  on  surface  geometry  and  reflectance,  scene 
illuminance  and  imaging  geometry.  The  image  irradiance  can  be  expressed  directly  as 
a  function  of  the  surface  orientation  if  illuminance,  reflectance  and  imaging  geometry 
are  assumed  to  be  constant.  Two  deterministic  formulations  for  SFS  problem  are 
discussed  in  the  following. 
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PDE  formulation 

The  shape  from  shading  problem  is  commonly  posed  as  a  nonlinear,  first  or- 
der partial  differential  equation  in  two  unknowns,  called  the  image  irradiance  equa- 
tion, E{x,y)  =  R{p,q)  [39],  where  E(x,y)  is  the  image  irradiance  at  a  point  (x,y), 
p  =  vx,q  —  vy  are  first  partials  of  the  surface  function  v(x,y),  and  R(p,q)  is  the  re- 
lation between  surface  orientation  (p,q)  and  image  irradiance  E(x,y).  To  overcome 
the  ambiguity  caused  along  occluding  contours  in  the  gradient  space  parameteriza- 
tion (p,  </),  a  reparameterization  of  surface  orientation  in  terms  of  a  stereographic 


mapping:  /  =  2ap,  g  =  laq  with  a  =  1/(1  +  >/l  +  p2  +  q2)  is  used.  With  this  repa- 
rameterization, the  overall  energy  function  (sum  of  stabilizer  and  data  energies)  to 
be  minimized  is  expressed  as 

£(f,9)  =  A//n(/'  +  fy]  +  ^l+92y)dxdy  +  J  JQ[E(x,y)  -  R(f,g)]2dxdy.    (2.17) 

The  Euler  Lagrange  equations  which  express  the  necessary  condition  for  a  minimum 
are  given  by  the  system  of  coupled  PDEs 


A/    =    \[R{f,g)-E{x,y))R, 
Ag    =    \[R(f,g)-E(x,y)]Rg 


(2.18) 


The  above  equations  may  be  discretized  and  solved  using  the  algorithm  of  Sim- 
chony  et  al.  [72],  which  enforces  integrability  condition  on  p  and  q.  The  method 
involves  solving  three  different  discretized  Poisson  equations  namely,  Kxi    =  bi, 
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Kx2  =  b2,  and  K1X3  =  b3,  where  K  is  the  stiffness  matrix  obtained  by  discretizing 
the  Laplacian  operator  (on  the  LHS  of  the  equation  2.18)  with  Dirichlet  boundary 
conditions  and  Ki  is  the  stiffness  matrix  obtained  by  discretizing  the  Poisson  equation 
for  the  height  from  orientation  problem  i.e., 

AZ  =  Px  +  qy  (2.19) 

with  Neumann  boundary  conditions.  The  vectors  bi  and  b2  are  the  appropriate 
RHSs  obtained  by  discrete  computation  of  the  RHS  of  equation  2.18  at  the  current 
estimated  value  of  (f,g).  b3  is  the  discrete  form  of  the  RHS  of  the  above  height  from 
orientation  equation  with  the  estimated  (p,  q)  from  equation  2.18.  The  structure  of 
the  stiffness  matrices  K  can  be  analyzed  using  the  computational  molecules  [81].  In 
this  dissertation,  we  apply  our  generalized  capacitance  matrix  algorithm  to  solve  the 
above  discretized  Poisson  equations. 

Energy  minimization  formulation 

The  other  formulation  for  the  SFS  problem  is  to  minimize  the  total  energy  func- 
tional containing  the  data  constraint,  the  integrability  constraint  and  the  smoothness 
constraint.  The  minimization  is  taken  with  respect  to  the  surface  orientation  (p,  q) 
and  the  surface  height  Z  altogether.  The  total  energy  functional  £(Z,p,  q)  is  the  sum 
of  the  smoothness  constraint  and  the  penalty  terms  in  this  problem,  i.e., 

£(Z,P,q)    =    \J  Jq(pI  +  P2y)  +  (ql  +  q2y)dxdy  +  ^  J  J(ZX -pf +  (Zy -qfdxdy 
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+ 


J  jn[E(x,y)-R(p,q)]2dxdy. 


The  first  term  in  the  above  equation  is  a  weak  smoothness  constraint  and  the  coeffi- 
cient A  controls  its  contribution  toward  the  final  solution.  The  second  term  enforces 
the  integrability  constraint,  ensuring  that  the  gradients  (p,  q)  of  the  recovered  surface 
Z  match  those  of  a  valid  surface.  Zx,  Zy  are  the  first  partials  of  the  surface  Z(x,  y). 
The  above  energy  can  be  discretized  directly  by  using  finite  element  techniques 
however,  the  same  set  of  discrete  domain  equations  can  be  arrived  at  via  the  use  of 
finite  difference  approximations  for  this  application  [76].  The  discrete  energy  is  given 
by 


E  =  ^(%!')-%?))2+^(^-l')2+(2r?)2+^(p!+^^?v2)  (2-2°) 


where  the  summations  are  over  all  the  pixels  in  the  image.   In  vector  notation,  the 
above  equation  can  be  written  as 


E(u)  =  |uTKu  -  bu  +  C  (2.21) 


Where,  u  =  [zTpTqT]  is  a  concatenation  of  the  vector  representation  of  the  fields 
Z,p  and  q.  The  minimum  of  the  above  quadratic  is  obtained  by  solving  Ku  =  b  with 
Ku  containing  linear  u  terms  and  the  constant  and  the  nonlinear  terms  form  the  b 
vector.  This  is  an  (N  x  N)  sparse  nonlinear  system  which  is  usually  solved  using 
iterative  methods.  These  iterative  methods  to  solve  the  nonlinear  system  normally 
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have  the  following  procedures  in  each  iteration:  (1)  compute  the  b  vector  using 
the  current  u,  (2)  update  the  solution  u  by  treating  b  as  a  constant.  In  the  past, 
multigrid  techniques  have  been  employed  in  solving  this  nonlinear  system  [79]  wherein 
a  hierarchy  of  problems  at  different  resolutions  needs  to  be  explicitly  constructed. 
Szeliski  [76]  presented  a  fast  shape  from  shading  algorithm  which  employs  hierarchical 
basis  for  preconditioning  the  conjugate  gradient  algorithm  used  to  solve  the  above 
nonlinear  system.  Unlike  the  multigrid  methods,  this  technique  does  not  require 
the  explicit  construction  of  a  multiresolution  pyramid  of  problems.  In  this  thesis  we 
apply  our  adaptive  preconditioning  technique  that  leads  to  a  faster  convergence  than 
the  one  in  Szeliski  [76].  Our  method  makes  use  of  the  spectral  characteristics  of  the 
data  constraint,  the  imposed  smoothness  constraint  and  the  integrability  constraint. 
This  spectral  function  is  then  used  to  modulate  the  frequency  characteristics  of  the 
chosen  wavelet  basis  leading  to  the  construction  of  the  preconditioner. 

2.3.3     Optical  Flow  Computation 

A  third  early  vision  problem  that  we  consider  in  this  dissertation  is  the  compu- 
tation of  optical  flow  from  an  image  sequence.  Many  techniques  for  computing  optical 
flow  have  been  proposed  in  literature.  These  can  be  classified  into  gradient-based, 
correlation-based,  energy-based,  and  phase-based  methods  [3].  Recently,  Barron  et 
al.,  [3]  have  presented  a  comprehensive  survey  of  various  optical  flow  computation 
techniques  along  with  a  quantitative  comparison  of  them.  We  note  that  the  survey  did 
not  do  any  comparison  of  the  computation  time  required  by  the  various  algorithms. 
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Two  new  and  robust  algorithms  namely,  a  modified  gradient-based  algorithm 
and  an  SSD-based  regularization  algorithm,  for  accurate  optical  flow  computation 
are  presented  in  this  dissertation.  In  the  modified  gradient-based  algorithm,  we 
propose  to  selectively  combine  the  image  flow  constraint  and  the  contour-based  flow 
constraint  into  the  data  constraint  in  a  regularization  framework  to  amend  the  errors 
in  the  image  flow  constraint  caused  by  the  brightness  discontinuities.  The  image  flow 
constraint  is  disabled  in  the  neighborhood  of  discontinuities,  while  the  contour-based 
flow  constraint  is  active  at  discontinuity  locations.  The  SSD-based  regularization 
algorithm  uses  the  SSD  measure  as  the  data  constraint  in  a  regularization  framework, 
which  leads  to  a  nonlinear  optimization  problem.  These  two  different  regularization 
formulations  are  described  below.  These  two  robust  algorithms  will  be  discussed  in 
details  in  chapter  6.  In  this  section,  we  briefly  describe  our  modified  gradient-based 
formulation. 

Our  modified  gradient-based  formulation  modifies  the  Horn  and  Schunk  for- 
mulation [40]  to  incorporate  a  separate  condition  along  discontinuities  in  the  image 
intensity  [35].  This  modification  drastically  enhances  the  performance  of  our  method 
over  other  gradient-based  methods  because  the  optical  flow  constraint  equation  is  in- 
valid along  intensity  discontinuities  and  causes  large  errors  in  the  computed  flow.  We 
get  drastically  better  and  more  robust  estimates  of  optical  flow  than  any  gradient- 
based  optical  flow  technique  reported  in  Barron  et  al.  [3]. 

Our  variational  formulation  of  the  optical  flow  problem  involves  minimizing  a 
combination  of  the  optical  flow  constraint  equation  with  a  global  smoothness  term 
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and  an  additional  constraint  term  along  edges  leading  to 

/  a((V£«u)+£t)2+A(||  Vu  ||2  +  ||  Vv  \\22)dx+J>(3((V(V2G*E)*u)+(V2G*E)t)2dx 

(2.22) 
Where  u(x,  t)  =  (u(x,t),v(x.,t))  is  the  velocity  field  to  be  estimated,  E  is  the  image 
brightness  function,  and  V2G  is  the  Laplacian  of  Gaussian  operator  [35].  The  first 
term  is  the  gradient  constraint  which  is  active  away  from  discontinuities,  the  second 
term  is  the  smoothness  constraint  on  the  velocity  field  with  A  controlling  the  contri- 
bution of  the  smoothness  term  and  the  third  term  is  the  flow  constraint  along  edges 
which  is  disabled  away  from  the  edges.  A  discrete  version  of  the  above  energy  can 
be  written  as 


J2(Exu  +  Eyv  +  Et)2  +  |  £K  +<)  +  «  +  O  (2-23) 


Where,  u  and  v  form  the  discretized  flow  vector,  ux,uy,vx  and  vy  represent  the  dis- 
cretized  partials  of  the  velocity  field,  Ex,  Ey  and  Et  correspond  to  the  discretized 
partials  in  the  x,  y  and  t  directions  respectively  -  collectively  obtained  from  the 
first  and  the  third  terms  of  the  equation  2.22.  Note  that  Ex,  Ey  and  Et  are  the 
discretized  partials  of  the  brightness  function  E  when  the  location  of  differentiation 
is  away  from  discontinuities,  and  they  represent  the  discretized  partials  of  V2G  *  E 
when  the  differentials  are  taken  at  the  discontinuity  locations.  The  minimization  of 
this  discretized  energy  leads  to  solving  a  linear  system  of  equations  Ku  =  b  where 
K  contains  the  discretized  differential  (smoothness)  operator  along  with  Ex,  Ey  and 
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the  b  contains  the  terms  corresponding  to  the  optical  flow  constraint  equation.  The 

— 2        — 2 

gradient  constraint  in  the  above  energy  function  implies  the  weighting  (Ex  +  Ey)  on 
(u,v)  at  each  discretized  spatial  location.  This  weighting  is  not  reasonable  since  the 
gradient  constraints  at  locations  of  high  gradient  are  not  necessarily  reliable.  In  fact, 
they  tend  to  be  unreliable  because  the  brightness  gradients  close  to  discontinuities  are 
normally  higher  than  those  away  from  discontinuities.  Therefore,  we  eliminate  this 
weighting  for  each  gradient  constraint  via  normalization.  To  have  a  uniform  contri- 
bution at  each  pixel  from  the  flow  constraint,  we  normalize  the  entries  Ex,  Ey,  ExEy 

2        — 2 

in  the  matrix  K  with  Ex  +  Ey. 

To  solve  the  above  quadratic  and  convex  energy  minimization  problem,  we  em- 
ploy our  adaptive  preconditioned  conjugate  gradient  algorithm  presented  in  this 
dissertation.  This  novel  adaptive  preconditioning  technique  is  applicable  to  any 
gradient-based  optical  flow  computation  method  that  uses  a  smoothness  constraint 
involving  the  2D  Laplacian  operator.  The  efficiency  as  well  as  accuracy  of  our  adap- 
tive preconditioning  technique  and  new  optical  flow  algorithms  will  be  demonstrated 
through  experimental  results  on  synthetic  and  real  image  sequences  in  chapters  5  and 


CHAPTER  3 
GENERALIZED  CAPACITANCE  MATRIX  THEOREMS 


3.1     Introduction 

There  are  numerous  applications  of  the  Sherman- Morrison- Woodbury  formula  in 
various  fields  [30]  [85].  These  applications  can  be  classified  into  two  categories.  One  is 
to  compute  the  inverse  of  a  matrix  that  has  low-rank  difference  w.r.t.  another  matrix 
whose  inverse  is  available  or  can  be  efficiently  computed;  the  other  is  to  solve  a  linear 
system  whose  matrix  differs  from  a  well-structured  matrix  by  a  low-rank  matrix  and 
very  efficient  methods  exist  to  solve  the  linear  system  with  the  well- structured  matrix. 
The  latter  case  is  also  called  the  capacitance  matrix  algorithm.  This  algorithm  has 
been  widely  used  for  solving  linear  systems  arising  from  the  discretization  of  the 
boundary  value  problems,  primarily  for  the  elliptic  partial  differential  equations.  In 
this  chapter,  we  state  and  prove  some  very  general  theorems  which  are  then  applied 
in  developing  a  new  capacitance  matrix  algorithm  in  the  next  chapter. 

When  A  €  &nxn,  U,  V  €  3JnXp,  and  both  the  matrices  A  and  (/  +  VTA_1U) 
are  nonsingular,  the  Sherman-Morrison- Woodbury  formula  [26]  gives  an  expression 
for  the  inverse  of  (A  +  UV   )  as 

(A  +  UV7)-1  =  A"1  -  A_1U(7  +  VTA-1V)-1yTA-\  (3.1) 
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The  above  assumptions  imply  that  the  matrix  (A  +  UVT)  is  also  nonsingular.  The 
matrix  (/  +  VTA_1U)  is  called  the  capacitance  matrix  and  is  denoted  by  C.  Note 
that  a  similar  formula  can  be  obtained  for  the  inverse  of  (A  +  UGVT),  where  the 
matrix  G  6  3£pXp  is  nonsingular  [30]. 

Let  B  =  A  +  UVT,  with  A,  B,  and  C  being  nonsingular,  then  the  linear 
system  Bx  =  b  can  be  solved  using  the  traditional  capacitance  matrix  algorithm, 
which  involves  the  following  steps: 

1.  Solve  Ax  =  b  for  x. 

2.  Compute  W  =  A-1U  by  solving  AW  =  U  column  by  column. 

3.  Form  the  capacitance  matrix  C  =  I  +  VrW. 

4.  Solve  C/3  =  VTx  for  0. 

5.  Compute  the  solution  x  =  x  —  W/3. 

This  capacitance  matrix  algorithm  is  derived  directly  from  the  Sherman-Morrison- 
Woodbury  formula,  and  therefore,  it  can  be  applied  only  when  A,  B,  and  C  are  all 
nonsingular,  making  it  unsuitable  for  general  linear  system  problems.  In  addition,  the 
choice  of  a  well-structured  matrix  A  which  is  close  to  the  original  matrix  is  limited 
to  be  nonsingular. 

Buzbee  et  al.  [10]  proposed  a  fast  algorithm  to  solve  the  Poisson  equation  ( 
a  second-order  elliptic  PDE  )  on  irregular  regions.  Their  algorithm  used  a  domain 
imbedding  technique  to  imbed  the  irregular  region  in  a  rectangular  region  and  utilized 
a  fast  Poisson  solver  (  on  the  rectangular  region  )  which  is  based  on  the  capacitance 
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matrix  algorithm.  The  discrete  Poisson  operator  on  the  rectangular  domain  is  the 
Laplacian  matrix,  which  is  singular  with  rank  n  —  1.  Therefore,  they  presented  a 
theorem  to  generalize  the  traditional  capacitance  matrix  algorithm  to  the  case  when 
rank(A)  =  n  —  1  and  B  is  nonsingular.  This  generalization  is  based  on  projecting  the 
right-hand-sides  (RHSs)  of  the  linear  systems  appearing  in  the  first  and  second  steps 
onto  the  range  of  A  through  a  unit  vector.  In  [58],  a  generalized  capacitance  matrix 
algorithm  was  presented  for  rank-deficient  matrices  A  and  B.  This  generalization 
is  accomplished  by  establishing  the  relationship  between  the  solutions  of  the  linear 
systems  with  matrices  A  and  B  using  their  singular  value  decompositions.  The 
resulting  algorithm  involves  the  projection  of  the  RHSs  of  the  linear  systems  with 
matrix  A  onto  its  range  through  the  eigenvectors  in  the  null  space  of  AT .  In  addition, 
two  auxiliary  matrices  A  and  B,  which  are  the  rank- augmented  versions  of  A  and  B, 
are  constructed  in  this  algorithm  to  apply  the  Sherman-Morrison-Woodbury  formula 
directly.  Hence,  this  capacitance  matrix  algorithm  requires  the  computation  of  all 
the  eigenvectors  in  the  null  spaces  of  A,  AT,  B  and  B   . 

The  capacitance  matrix  algorithm  has  also  been  employed  as  a  discrete  anal- 
ogy to  the  potential  theory  for  partial  differential  equations  by  Proskurowski  and 
Wildlund  [65,  66],  and  others.  However,  the  framework  of  their  capacitance  matrix 
algorithms  is  limited  to  solving  second-order  elliptic  boundary  value  problems. 

The  other  possibility  to  generalize  the  capacitance  matrix  algorithm  for  singular 
A  and  B  is  to  directly  apply  a  generalized  inverse  form  for  (A  +  UV  )  to  solving 
the  linear  system.  Some  works  on  the  extension  of  the  Sherman- Morrison-Woodbury 
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formula  in  equation  3.1  have  been  reported  in  [11],  [34],  [67],  and  references  therein. 
However,  these  generalized  inverse  forms  of  (A  +  UV  )  have  been  derived  under 
very  restrictive  assumptions.  In  [11],  a  generalized  inverse  form  was  given  for  the 
case  when  U  and  V  are  both  vectors,  which  means  B  is  a  rank-one  modification  of 
A.  Although  the  generalized  inverse  of  a  rank-p  modification  of  A  can  be  obtained  by 
using  the  rank-one  modification  generalized  inverse  formula  iteratively,  the  resulting 
algorithm  is  very  inefficient  in  terms  of  storage  and  computation  when  the  size  of 
the  matrix  A  is  very  large.  Henderson  and  Searle  [34]  derived  several  generalized 
inverse  forms  for  (A  +  UVT)  under  the  assumptions  that  range(\JV  )  C  range(A) 
and  both  the  matrices  A  and  UVT  are  symmetric.  Recently,  Riedel  [67]  presented  a 
generalized  inverse  form  for  (A  +  UV  )  with  a  very  restrictive  assumption,  which  is 
implied  by  the  condition  rank(B  —  A)  <  dim(null(A)).  This  condition  implies  that 
Riedel's  result  at  best  can  give  a  generalized  inverse  for  a  rank-<7  modification  of  A, 
where  q  <  dim(null(A)).  This  condition  becomes  very  restrictive  and  impractical 
when  the  dimension  of  the  null  space  of  A  is  small.  Although  the  aforementioned 
generalized  inverse  formulas  can  be  used  to  design  the  corresponding  capacitance 
matrix  algorithms,  the  associated  assumptions  restrict  them  from  being  applied  for 
general  matrices  A  and  B. 

In  this  chapter,  we  present  several  theorems  which  we  call  generalized  capaci- 
tance matrix  theorems  that  lead  to  a  generalized  capacitance  matrix  algorithm  which 
can  be  applied  to  solve  very  general  linear  systems.  In  the  main  theorem,  there  are 
no  restrictions  on  the  rank  of  the  matrices  A  or  B  and  it  can  be  applied  to  any  linear 


32 


system  as  long  as  it  has  a  solution.  In  addition,  with  this  generalized  capacitance 
matrix  theorem,  we  have  ample  freedom  to  choose  a  well-structured  matrix  A  for  de- 
signing more  efficient  capacitance  matrix  algorithms.  Unlike  in  [58],  no  construction 
of  auxiliary  rank-augmented  matrices  is  required  since  we  do  not  use  the  Sherman- 
Morrison- Woodbury  formula  in  our  generalization,  instead  our  resulting  algorithm 
uses  only  the  eigenvectors  in  null(AT)\null(BT)  to  project  the  RHSs  of  the  linear 
systems  with  matrix  A  onto  range(A).  To  achieve  the  projection  of  the  RHS  onto 
range(A),  we  subtract  the  RHS  along  the  vectors  Uj,  1  <  j  <  m^  —  m,  instead  of 
subtracting  along  the  eigenvectors  in  the  null  space  of  A  as  in  [58].  By  using  our 
projection  technique,  the  sparsity  of  the  RHS  can  be  preserved,  while  the  sparsity  of 
the  RHS  is  usually  destroyed  after  subtracting  along  the  eigenvectors  as  in  [58].  We 
take  advantage  of  this  sparsity  property  to  obtain  fast  numerical  algorithms  in  our 
work.  Unlike  in  [58],  the  capacitance  matrix  and  the  capacitance  matrix  equation 
are  explicitly  given  in  our  generalized  capacitance  theorems  and  the  algorithm.  By 
explicitly  specifying  the  capacitance  matrix  and  the  capacitance  matrix  equation,  it 
is  possible  to  modify  the  RHSs  of  these  equations  in  accordance  with  the  singularities 
of  A  and  B.  With  the  structure  of  the  capacitance  matrix  equation  given  explicitly, 
it  is  easy  to  design  an  efficient  numerical  algorithm  to  solve  this  equation.  When 
comparing  the  generalized  inverse  forms  of  (A  +  UV  )  proposed  in  [34]  with  our 
generalized  capacitance  matrix  theorems,  we  observe  that  the  generalized  inverse  for- 
mula of  Henderson  and  Searle  [34]  is  a  degenerate  result  of  our  theorem  2,  which  has 
a  more  relaxed  assumption,  i.e.  null(A)  C  nit//(B)  and  range(B)  C  range(A),  than 
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the  assumption,  range(B)  C  range(A)  for  symmetric  A  k  B,  made  in  [34].  When 
applied  to  solve  linear  systems  using  the  capacitance  matrix  approach  with  the  matri- 
ces A  and  B  satisfying  the  aforementioned  assumption,  both  the  generalized  inverse 
formula  in  [34]  and  our  theorem  2  lead  to  the  same  algorithm. 

We  will  state  and  prove  the  generalized  capacitance  matrix  theorems  for  different 
types  of  relationships  between  the  matrices  A  and  B  subsequently.  In  the  next 
chapter,  the  generalized  capacitance  matrix  algorithm  will  be  constructed  based  on 
the  theorems  presented  in  this  chapter. 

3.2     Generalized  Capacitance  Matrix  Theorems 

We  categorize  the  generalized  capacitance  matrix  theorems  according  to  the 
type  of  relationship  between  the  matrices  A  and  B.  The  first  theorem  is  for  the  case 
when  null(B)  C  null(A)  and  range(A)  C  range(B),  while  the  second  theorem  is 
for  the  assumption  null(A)  C  null(B)  and  range(B)  C  range(A).  There  are  no 
assumptions  on  A  and  B  in  the  third  theorem.  The  generalized  capacitance  matrix 
theorem  for  the  first  case  is  stated  as  follows. 

Theorem  1  Given  B  =  A  +  UGVT,  where  A,B  G  ftnXn,  G  <E  ftpXp  is  nonsingular, 
U  =  [Ul,...,up]  <E  ftnxp,  V  =  [Vl,...,vp]  e  &nxp,  with  both  the  sets  {Ul,...,up} 
and  {vi, . . . ,  vp}  containing  linearly  independent  vectors.  Assume  null(B)  C  null(A) 
and  range(A)  C  range(B).  Let  rank(B)  =  n  —  tub  and  rank(A)  =  n  —  m^, 
0  <  tub  <  rriA  <  n,  then  there  exist  two  sets  of  orthogonal  bases,  {qi, . . .  ,qmjl] 
and  {q\,...,q'm   },  for  null(AT)  and  null(A)  respectively,  such  that  null(B   )  = 
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span{  qmA-mB+i ,  . . . ,  qm„  }  and  null(B)  =  span{  q'mA.mB+1 , . . . ,  q'mA  }  •  Let  <£ui  = 
ai6(i,j),  a,  ^  0,  /or  1  <  »',j  <  m^  -  mB. 
Let  x  be  a  solution  to 


mA-7TlB      Q^h 

Ax  =  b-     £     iH«ii  (3-2) 


w/iere  b  €  range(B).  For  1  <  i  <  p,  let  tj,  be  a  solution  to 


mA-mB  qTu- 

At7,  =  Ui  -     V     -4— Uj,  (3.3) 


and  choose  rj1 . . .  rjm  _m     to  be  linearly  independent  and  the  linear  combinations  of 
q'x,...,q'm  _m    .  Define  the  capacitance  matrix  C  as 


mA-mB  n^U 

C  =  G-1+VT77-     Y.    G"S%-,  (3.4) 


where  tj  =  [rj1, . . .  ,r) ],  and  ej  is  the  j-th  standard  basis  vector  in  §?p.    Then,  there 
exists  a  unique  solution  (3  to  the  capacitance  matrix  equation 


C(3  =  \Tx-     £     Vg-1^'  (3-5) 

j=i     Ij  uj 


and  x  —  tj/3  is  a  solution  to  the  linear  system  Bx  =  b. 

We  will  use  the  following  proposition  to  prove  this  theorem. 
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Proposition  1  Under  the  assumptions  in  Theorem  1,  we  have  vf  q'_,  =  uf  q,  =  0  for 
1  <  i  <  p  and  mA  -  mB  +  1  <  ;  <  mA.  In  addition,  V^qV.-q'^-mJ  and 
UT[qi . . .  qmA-mB]  are  of  full  rank. 

Proof:  From  the  assumption  in  Theorem  1,  Bq'^  =  Aq'_,  =  0,  mA  -  mB  +  1  <  j  < 
mA.  Since  Bq',  =  (A  +  UGV7)^,  we  have  UGVTq'j  =  0  for  mA  -  mB  +  1  <  j  < 
mA.  From  the  assumption  that  U  €  3£nXp,p  <  n,  has  full  rank  and  G  is  nonsingular, 
it  is  obvious  that  V7^  =  0.  Therefore,  we  have  vfq'^  =  0  for  1  <  i  <  p  and 
mA  -  mB  +  1  <  j  <  mA.  Similarly,  we  can  show  that  ufq,  =  0,  1  <  *  <  p  and 
mA  -  mB  +  1  <  j  <  mA. 

We  use  proof  by  contradiction  to  show  that  the  matrix  VT[q'i . . .  q'mx_mJ  is  of 
full  rank.  Let's  assume  this  matrix  is  rank  deficient,  i.e.,  there  exists  a  nonzero  vector 
aT  =  (au  . . . ,  amA.mB)  such  that  VT[q\  . . .  q'myl_mJa  =  0.  Let  r  =  [q'x . . .  q'myl_mJa, 
then  r  G  null(A)  and  r  is  orthogonal  to  null(B).  Post-multiplying  both  sides  of  the 
equation  B  =  A  +  UGVT  by  r  yields  Br  =  Ar  +  UGVrr.  Since  Ar  =  0  and 
VTr  =  0,  we  obtain  Br  =  0.  But,  from  the  above,  r  is  a  nonzero  vector  orthogonal  to 
ntt//(B),  i.e.  Br  ^  0,  leading  to  a  contradiction.  Consequently,  the  assumption  that 
VT[q'1 . . .  q'myl_mB]  is  rank  deficient  is  not  true,  implying  that  VT[q'1 . . .  q'mA_m  J 
is  of  full  rank.  Similarly,  we  can  prove  that  UT[qi . . .  <\mA-mB]  is  of  full  rank.     ■ 

We  are  now  equipped  to  prove  Theorem  1.  The  proof  is  in  four  parts,  and 
we  will  show  that  (1)  there  exist  two  sets  of  orthogonal  bases,  {qi,...,qTOyJ  and 
{q'x, . . .  ,q'mA},  for  null(AT)  and  null(A)  respectively,  such  that  nu//(Br)  =  span{ 
qmA-mB+u  •••.qm,}  and  null(B)  =  span{  q'mx_mB+1, . . . ,  q'mJ,  (2)  there  exist 
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solutions  to  equations  3.2  and  3.3,  (3)  the  capacitance  matrix  C  is  nonsingular, 
implying  that  equation  3.5  has  a  unique  solution,  and  (4)  B(x  —  ?7/3)  =  b. 

Proof  for  Theorem  1: 
(1)  Since  rank(A)  =  n-mA  and  rank(B)  =  n-mB,  it  is  obvious  from  their  singular 
value  decompositions  that  there  exist  tua  orthogonal  eigenvectors,  <f>1 , . . . ,  <j>mA ,  for 
A  and  mB  orthogonal  eigenvectors,  Vi,  •  •  ',1>mBi  ^or  **  suc^  t*iat  nu^(A)  =  span{ 
4>x,...,(j>mA}  and  null(B)  =  span{^)1,...,^fmg}.  From  the  assumption  null(B) 
C  null(A),  we  can  write  each  V\->  i  =  1,  •  • .  ,m,  in  terms  of  a  linear  combination  of 
the  basis  vectors  <f>lt . . . ,  <t>mA-  Thus,  we  have 


/  \ 


\    ^mB    J 


=  T 


/  \ 

4>x 


\   ^rnA    J 


(3.6) 


where  T  G  $tmB*mA  is  of  full  rank.  By  using  the  Householder  or  Givens  transforma- 
tion [26],  we  can  obtain  the  factorization  of  T  as  follows: 


T  = 


0    T 


Q, 


(3.7) 
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where  0  is  an  mB  x  (mA  -  mB)  zero  matrix,  T  €   SRmsxms  is  nonsingular,  and 
Q  <=  sfcrnAxmA  js  orthogonal.  Define  the  vectors  q'x, . . . ,  q'mA  by  the  following  relation 


',</ 


=  Q 


V   ^rnA    J 


I  \ 

4>x 


\  <t>mA  ) 


(3.8) 


We  can  see  that  {q\, . . .  ,q'mA}  is  a  set  of  orthogonal  bases  for  null(A)  and  its 
subset  {q'mA_TOjrn,...,  q'mA}  forms  an  orthogonal  basis  for  nu//(B).  Thus,  the 
existence  of  the  basis  {q': , . . . ,  q'mA  }  which  satisfies  the  aforementioned  requirements 
is  proved.  Similarly,  we  can  prove  that  there  exists  an  orthogonal  basis  {qi, . . . ,  qmA} 
for  null(AT)  and  null(BT)  =  span{qmA-mB+i, . .  .,qmA}  is  satisfied. 

(2)  From  Proposition  1,  we  have  uf  q;  =  0  for  1  <  i  <  p  and  mA  -  mB  + 1  <  j  < 
mA.  The  assumption  b  G  range(B)  implies  bTq,  =  0  for  mA  -  mB  + 1  <  j  <  mA. 
Combining  these  facts  and  the  orthogonality  of  the  eigenvectors  q,,  j  =  1, . . . ,  mA,  we 
can  see  that  the  RHSs  of  equations  3.2  and  3.3  are  orthogonal  to  q,  for  j  =  1, . . . ,  mA, 
thus,  the  RHSs  are  orthogonal  to  null(AT).  Consequently,  the  RHSs  of  equations 
3.2  and  3.3  are  both  in  the  range  of  A,  which  means  there  always  exist  solutions  to 
equations  3.2  and  3.3. 

(3)  To  prove  that  C  is  nonsingular,  we  show  that  C/3  =  0  =>  /3  =  0.  Let's 
suppose  C/3  =  0,  then,  from  equation  3.4,  we  have 


mA—mB 


VTr,(3=     £    G 


,     qJU/3 


e,- 


j=i 


3       T 


-G~l(S. 


(3.9) 
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By  definition,  Bt//3  =  Arjf3  +  UVTT7/3.  Substituting  equations  3.3  and  3.9  into  this 
equation,  we  get  Brj/3  =  0.  Thus,  ?7/3  €  null(B).  Since  null(B)  C  null(A),  we 
have  T}(3  G  nu//(A),  i.e.  Atj/3  =  0.  Substituting  A?7  by  equation  3.3  and  denoting 
(3  =  (/?i,..., /?p),  we  obtain 


P  P  mA-m.B  aTii- 

,=1  i=l  J=l        H,UJ 


(3.10) 


After  rearranging,  equation  3.10  becomes 


mA-mB  a?\J3  P 

j=l  "j  UJ  j=m^-mfl+l 


(3.11) 


Since  ui, . . . ,  Up  are  linearly  independent,  equation  3.11  imposes  the  following  condi- 
tions on  /?j 


ft    = 


qJU/3 


-,     j  =  l,...,mA  -mB, 


ft    =    0,     j  =  mA-mB  +  l,...,p- 


(3.12) 
(3.13) 


Substituting  these  conditions  into  the  RHS  of  equation  3.9,  we  obtain 


/  \ 


»      [»7j,  •  •  •  ,f7m,i-mBJ 


PmA-r 


0. 


B    / 


(3.14) 
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For  i  =  l,...,mA  -  mB,  the  RHS  in  equation  3.3  equals  0,  thus,  »7,  G  null(A). 
From  the  assumption  that  'Hl,...,rirnA_mB  are  chosen  to  be  linear  combinations  of 
q'x, . . . ,  q'm  _m  and  linearly  independent,  we  can  write  [tjx  ...  T7mA_mfl]  in  terms 
of  [q'i     •  •  •    q'm^-m  J  ^  follows: 


[»7l      •  •  •     *1mA-mB]  =  [q'l      •  •  •     q'myl-mB]T' 


(3.15) 


where  T  G  $(mA-mB)x(mA-mB)  \s  nonsingular.    By  substituting  equation  3.15  into 


equation  3.14,  we  get 


VT[q\    ...    q'm._mjT 


y   PmA-mB    J 


=  0. 


(3.16) 


From  Proposition  1,  the  matrix  V^q'j     . . .     q'm/1_mfl]  is  of  full  rank  and  is  of  size 
p  x  (rriA  —  tub)  with  p  >  tua  —  ™-b,  which  implies 


/  \ 

Pi 


PmA- 


mA-mB 


=  0. 


(3.17) 


/ 


Since  T  is  nonsingular,  this  makes  /?,•  =  0,  t  =  1, . . .  ,mA  —  tub-  Combining  this  with 
equation  3.13,  we  obtain  (3  =  0.  Thus  concluding  that  C/3  =  0  implies  /3  =  0, 
therefore,  C  is  nonsingular. 
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(4)  Now,  we  will  show  that  B(x  -  i](3)  =  b.   Substituting  A  +  UGVT  for  B 
and  expanding,  we  have 

B(x  -  Tj/3)  =  Ax  -  Ari/3  +  UGVTx  -  UGVTT7/3.  (3.18) 

Rewriting  VTT7/3  using  equations  3.4  and  3.5  and  some  algebraic  manipulation,  we 
get 


mA-mB       Tl  mA-mB  Tjj 

VTV(3  =  VTZ-     £     V^G-iet  -  G-*/3  +     £    .A-fi.  (3.19) 


Substituting  for  Ax,  At7  and  VT77/3  from  equations  3.2,  3.3  and  3.19  respectively, 
equation  3.18  leads  to  B(x  -  tj(3)  =  b.     ■ 

In  theorem  1,  we  assume  that  the  matrices  U  and  V  contain  linearly  independent 
columns  and  G  is  nonsingular,  which  is  equivalent  to  saying  that  U,  V  and  G  are  all 
of  full  rank,  in  addition,  the  condition  qfu,  =  oti6(i,j)  must  be  satisfied  for  1  <  i,  j  < 
ruA—ms-  In  fact,  we  can  prove  that  given  any  two  matrices  A,  B  €  3£nXn,  there  always 
exist  full-ranked  matrices  U,  V  €  %nXp  and  G  €  ftpxp  such  that  UGVr  =  B- A  and 
p  =  rank(B  —  A).  This  can  be  verified  by  taking  the  singular  value  decomposition  of 
the  matrix  B  —  A  as  U27VT,  where  both  tj  and  V  are  orthogonal  matrices  and  U  is 
a  diagonal  matrix  with  rank  p  =  rank(B  —  A).  We  can  reduce  the  size  of  the  matrix 
£  by  retaining  the  p  nonzero  diagonal  entries  and  taking  out  the  remaining  rows 
and  columns  of  zeros.  This  reduced  matrix  G  G  3?pxp  is  nonsingular.  In  addition, 
we  take  out  the  columns  in  U  and  V  corresponding  to  the  zero  singular  values  in  U, 
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then  the  reduced  matrices  U,  V  €  3JnXp  are  obtained.  It  is  not  difficult  to  see  that 
UGVr  =  B  —  A  and  the  matrices  U,  V  and  G  are  of  full  rank. 

As  for  the  condition  qfu,  =  cti6(i,j),  1  <  i,j  <  mA  -  tub,  we  can  find  such 
a  U  by  transformation.  If  {U,G,V}  satisfies  UGVT  =  B  -  A  and  each  is  of  full 
rank,  then  the  alternative  set  {UQ,  Q_1G,  V}  also  satisfies  these  requirements  when 
the  matrix  Q  €  9£pXp  is  nonsingular.  From  Proposition  1,  Ur[qi . .  .qm>4_mB]  has 
full  rank.  We  can  find  a  transformation  matrix  Q  to  make  the  upper  (mA  -  mg)  x 
(m,A  —  tub)  block  of  the  transformed  matrix  QTUT[qi . . .  qm/1_mB]  diagonal.  Then, 
the  condition  qfu_,  =  a,<5(i,  j),  1  <  *,  j  <  rriA  -  tub,  is  satisfied. 

There  are  infinite  number  of  choices  for  the  matrices  U,  V  and  G  to  satisfy  the 
aforementioned  requirements.  The  above  choice  from  the  singular  value  decomposi- 
tion of  B  —  A  is  just  a  particular  one  that  leads  to  a  diagonal  G  matrix.  This  makes 
the  computation  of  G_1  in  equation  3.4  trivial.  The  choice  of  A,U,  V  and  G  is  very 
crucial  for  the  efficiency  of  the  numerical  algorithms  based  on  the  capacitance  matrix 
algorithm  to  solve  a  linear  system. 

Now  let's  turn  to  the  other  case  when  null(A)  C  null(B)  and  range(B)  C 
range(A).  Note  that  these  two  constraints  are  equivalent  to  each  other  when  both 
B  and  A  are  symmetric  matrices  as  in  [34]. 

Theorem  2  Given  B  =  A  +  UGVT,  where  A,B  €  ftnxn,  G  €  3Rpxp  is  nonsingular, 
U  =  [Ul,...,up]  €  »nxp,  V  =  [vx,...,vp]  €  Snxp,  with  both  the  sets  {ux,...,up} 
and  {vi, . . . ,  vp}  containing  linearly  independent  vectors.  Assume  null(A)  C  nit//(B) 
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and  range(B)  C  range(A).  Let  x  be  a  solution  to 


Ax  =  b,  (3.20) 

where  b  G  range(B).  For  1  <  i  <  p,  let  tj{  be  a  solution  to 

At/,  =  u„  (3.21) 

Define  the  capacitance  matrix  C  as  G_1  +  Vtt?,  where  rj  =  [tj1,  . .  .,rjp].  Then  there 
exist  a  solution  (3  to  the  capacitance  matrix  equation 

C/3  =  VT5c,  (3.22) 

and  x  —  tj/3  is  a  solution  to  the  linear  system  Bx  =  b. 

We  will  use  the  following  proposition  to  prove  the  above  theorem. 

Proposition  2  In  Theorem  2,  let  {qi, . . . ,  qmA}  and  {q'x, . . . ,  q'myl}  be  the  sets  of  or- 
thogonal eigenvectors  in  null(AT)  and  null(A)  respectively,  where  tha  =  n-rank(A). 
Then,  qj u,  =  q'(  Vj  =  0,  for  i  =  1, . . . ,  tua  and  j  =  1, . . .  ,p. 

Proof:  From  the  assumption  that  null(A)  C  nu//(B),  we  can  see  that  the  eigen- 
vectors q',-,  i  =  1, . . .  ,771,4,  are  also  in  null(B).  Since  Bq',-  =  Aq',  +  UGV  q'i?  we 
obtain 

UGVTq',  =  0,  (3.23) 
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for  1  <  i  <  mA.  The  matrix  U  is  of  size  n  x  p,  n  >  p,  and  of  full  rank,  consequently 
GVTq';  =  0.  From  the  assumption  that  G  is  nonsingular,  we  have  VTq',  =  0. 
Therefore,  q'J v,  =  0  for  i  =  1, . . . ,  mA  and  j  =  1, . . .  ,p. 

Similarly,  we  obtain  the  equation  VGTUTqt  =  0  from  the  condition  BTq,  = 
ATq,  =  0  which  is  derived  from  the  assumption  range(B)  C  range(A).  Using 
the  similar  reasoning  for  this  equation  leads  to  the  conclusion  that  q-  Uj  =  0,  i  = 

l,...,mxandj  =  l,...,p.    ■ 

We  are  now  poised  to  prove  Theorem  2. 

Proof  for  Theorem  2  :  The  proof  is  presented  in  three  parts  namely,  (1)  we 
will  show  that  there  exist  solutions  for  equations  3.20  and  3.21,  (2)  we  will  establish 
that  there  exist  a  solution  for  equation  3.22,  and  (3)  show  that  B(x  —  tj/3)  =  b. 

(1)  To  show  that  there  exist  a  solution  for  equations  3.20  and  3.21,  we  prove  that 
the  RHSs  of  both  equations  are  in  the  range  of  A.  It  is  obvious  that  b  €  range(A) 
for  equation  3.20,  since  b  €  range(B)  and  range(B)  C  range(A).  For  equation  3.21, 
from  Proposition  2  we  can  see  that  u,-,  1  <  i  <  p,  is  orthogonal  to  null(AT),  which 
means  ut-  €  range(A)  for  i  —  1,. . .  ,p. 

(2)  As  for  equation  3.22,  we  need  to  show  that  VTx  is  in  the  range  of  C.  We 
use  proof  by  contradiction  in  the  following.  Assume  that  VTx  is  not  in  the  range  of 
C,  then  there  exists  a  nonzero  vector  7  €  nit//(CT),  i.e.  CT7  =  0,  such  that 

7T(VTx)  ^  0.  (3.24) 
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Using  the  generalized  inverse  l  A+  of  the  matrix  A,  we  can  write  x  as  follows. 


mA 


x  =  A+b  +  ^a.q'i, 


(3.25) 


i=i 


where  mA  =  n  -  rank(A),  q'x, . . . ,  q'mA  are  the  orthogonal  eigenvectors  in  null  (A), 
and  a,  €  3ft,  Vi.  Then, 

Vrx  =  VTA+b  +  £  a,VTqV  (3-26) 


«=i 


From  Proposition  2,  we  have  VTq',  =  0,  for  i  =  1, . . . ,  mA.  Substituting  into  equation 
3.26,  we  have  Vrx  =  VTA+b.  Substituting  for  VTx  in  equation  3.24  yields 


■T\r-.\T 


(A+iV7)Tb^0. 


(3.27) 


We  can  show  that  A+TV7  G  null(BT)  as  follows. 

BT(A+TV7)    =    (A  +  UGVT)TA+TV7  (3.28) 

=    (ATA+TV  +  VGTUTA+TV)7  (3.29) 

Since  ATA+T  =  /  -  ££1  q'.-q'f ,  using  a  result  of  Proposition  2  namely,  q'f  V  =  0 
for  i  =  1, . . . ,  rriA,  we  have 


TA+ri 


™a 


jT, 


ATA+i  v  =  v  -  22  q'^q'. v  =  v- 


(3.30) 


xLet  the  singular  value  decomposition  of  A  be  A  =  US\T ,  where  U  and  V  are  orthogonal 
matrices  and  S  =  diag(0, . . .  ,0,XmA+i, . .  .,\n).  Then  its  generalized  inverse  A+  =  VJE+UT, 
where  27+  =  diag(0, ...,  0,  j^, . . .,  £)  [26]. 
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Substituting  equation  3.30  into  equation  3.29  yields 

BT(A+TV7)  =  VGT(G-1  +  VTA+U)T7  (3-31) 

Again,  using  Proposition  2,  we  can  show  that  C  =  G-1  +  VTA+U.  Consequently, 
we  obtain  BT(A+TV7)  =  VGrCT7  =  0  since  CT7  =  0  by  assumption.  Thus, 
A+TV7  e  null(BT).  Combining  this  with  the  equation  3.27  leads  to  the  conclusion 
that  b  is  not  in  range(B).  This  is  a  contradiction  to  the  assumption  b  €  range(B). 
Therefore,  the  assumption  that  VTx  is  not  in  the  range  of  C  is  untrue.  Hence,  we 
can  conclude  that  Vrx  G  C.  This  proves  that  there  exists  a  solution  to  equation 
3.22. 

(3)  The  proof  for  B(x  -  rj(3)  —  b  is  very  similar  to  part  (3)  of  the  proof  for 
Theorem  1  and  is  omitted  here.     ■ 

If  we  restrict  the  above  constraints  on  the  relation  between  B  and  A  to  null(B)  = 
null(A)  and  range{B)  —  range(A),  then  the  capacitance  matrix  C  is  nonsingular, 
i.e.,  the  capacitance  matrix  equation  has  a  unique  solution.  This  result  is  stated  in 
the  following  corollary. 

Corollary  1  In  Theorem  2,  if  null(A)  =  null(B)  and  range(A)  —  range(B),  then 
the  capacitance  matrix  C  is  nonsingular. 

Proof:  To  prove  that  C  is  nonsingular,  we  show  that  C(3  =  0  =>  (3  —  0.  Suppose 
C/3  =  0,  then,  from  the  definition  of  C,  we  have  VT?7/3  =  —  G-1/3,  which  can  be 
substituted  into  the  equation  Bt//3  =  A?7/3  +  UVT»7/3,  along  with  the  equation  for 
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A*7  from  (3.21),  to  get  Btj(3  =  0,  i.e.,  tj/3  €  nu//(B).  Since  null(A)  =  null(B),  we 
have  Arj/3  =  0.  Using  equation  3.21  for  Arj,  we  get  V(3  —  0.  From  the  assumption  on 
the  matrix  U  namely,  Uj, . . . ,  up  are  linearly  independent,  we  conclude  that  /3  =  0. 
Therefore,  C  is  nonsingular.     ■ 

In  the  above  theorems,  we  impose  the  constraints  on  the  relationship  between 
B  and  A  either  via  null(B)  C  null(A)  and  range(A)  C  range(B)  or  via  null(A)  C 
null(B)  and  range(B)  C  range(A).  Now  we  consider  the  general  case,  i.e.,  no  con- 
straint on  the  relationship  between  B  and  A  is  imposed.  The  following  theorem  is 
given  for  this  general  case  and  can  be  used  for  any  matrices  B  and  A  in  3£™xn  as  long 
as  there  exists  a  solution  for  the  associated  capacitance  matrix  equation. 

Theorem  3  Given  B  =  A  +  UGVT,  where  B,  A  €  &nxn,  G  G  ftpxp  is  nonsingular, 
U  =  [ux,...,up]  e  %nxp,  V  =  [vj,...,Vp]  €  3JnXp,  with  both  the  sets  {u!,...,up} 
and  {vi,...,vp}  containing  linearly  independent  vectors.  Let  rank(A)  =  n  —  m^ 
dim(null(B)  n  null(A))  =  m',  and  dim(null(BT)  C\  null(AT))  =  m,  0  <  m',m  < 
ttia  <  n,  then  there  exist  two  sets  of  orthogonal  basis,  {q!,...,qm^}  and  {q'1? . . .  ,q'm^}, 
fornull(AT)  and  null(A)  respectively,  such  that  null(BT)C\null(AT)  =  span{qmA-m+i, 
...,qmA}  and  null(B)  D  null(A)  =  span{q'mA_m,+1,. . .  ,q'mA}.  Let  rank(B)  = 
n  —  tub,  0  <  m',m  <  tub  <  n,  then  there  exist  tub  —  rn'  vectors  r'm<+i , . . . ,  r'TOB 
such  that  the  set  {q'mA_m,+1, . . .  ,q'mx,r'm«+i, . . .  ,r'mB}  is  an  orthogonal  basis  for 
null(B).  Let  qfuj  =  a,<5(i,  j),  a^  ^  0,  for  1  <  i,j  <  tua  —  m. 
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Let  x  be  a  solution  to 


mA-m    aTu 

Ax  =  b-    £    ^f-uif  (3.32) 


where  b  €  ran^e(B).  For  1  <  i  <  p,  let  ?jt  be  a  solution  to 


Aifc  =  u,  -    2    ^_Lui'  (3-33) 

tj  qJui 


and  choose  rj^  . . .  »7mx_m  £o  6e  linearly  independent  and  linear  combinations  o/q'x, 
q'm  _m.  Define  the  capacitance  matrix  C  as 


C^G^+V^-    £   G"1^^,  (3.34) 

j=i  %  ui 


w/iere  77  =  [ift, . . .  ,  T7P],  and  ej  is  the  j-th  standard  basis  vector  in  W.  Assume  the 
two  sets  of  vectors  {Ar'm/+i, . . . ,  Ar'mB}  and  {ul5 . . . ,  up}  are  linearly  independent, 
then  there  exists  a  unique  solution  /3  to  the  capacitance  matrix  equation 


C/3  =  VTx-    T    -^r—G-'ej,  (3.35) 


and  x  —  tff3  is  a  solution  to  the  linear  system  Bx  =  b. 

The  proof  of  this  theorem  is  omitted  here  since,  it  is  similar  to  the  proof  of 
Theorem  1.  This  proof  will  require  the  use  of  the  fact  that  qf  Uj  =  0,  for  tua  —  m  + 1  < 
i  <  m,A  and  1  <  j  <  p,  and  the  matrix  VT[q'1  . . .  q'm     m\  is  of  full  rank. 
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In  general,  Theorems  1  and  2  can  be  regarded  as  the  special  cases  for  Theo- 
rem 3  with  the  exception  of  the  additional  assumption  on  the  linear  independence  of 
the  two  sets  of  vectors  {Ar'm,+i, . . . ,  Ar'ms}  and  {ux, . . . ,  up}  made  in  Theorem  3  to 
guarantee  the  uniqueness  of  the  solution  in  the  capacitance  matrix  equation.  This  as- 
sumption is  automatically  satisfied  for  Theorem  1  since  the  set  {Ar'm/+i,. . . ,  Ar'mfl} 
is  empty  and  the  vectors  ux, . . . ,  up  are  linearly  independent.  However,  there  is  no 
such  assumption  in  Theorem  2,  which  gives  the  existence  of  a  solution  to  the  capac- 
itance matrix  equation  instead  of  the  uniqueness  of  the  solution  given  in  Theorem 
3. 

3.3     Discussion 

In  this  chapter,  we  presented  the  generalized  capacitance  matrix  theorems  for 
solving  linear  systems.  Our  generalized  capacitance  matrix  theorems  can  be  used  for 
very  general  matrices  A  and  B,  therefore  the  resulting  algorithm  can  be  applied  to  a 
very  broad  class  of  problems  and  we  have  ample  freedom  to  choose  the  matrix  A  ap- 
propriately for  designing  efficient  numerical  algorithms.  The  traditional  capacitance 
matrix  algorithm  [10,  65,  66]  has  been  primarily  applied  to  solve  the  second-order 
elliptic  boundary  value  problems,  whereas  our  generalized  capacitance  matrix  algo- 
rithm can  be  used  to  solve  any-order  elliptic  boundary  value  problems.  Also,  we  can 
handle  cases  such  as  the  shape  from  shading  (  SFS  )  problem  on  an  irregular  domain 
with  Neumann  boundary  conditions  which  specifically  yields  a  singular  K  matrix. 
This  case  can  not  be  solved  using  the  capacitance  matrix  technique  of  Buzbee  et  al., 
[10]  as  mentioned  in  [72]  because,  the  capacitance  matrix  theorem  of  [10]  requires  a 
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nonsingular  K.   Our  algorithm  gives  a  solution  to  this  problem  (recovers  a  surface) 
within  a  scale  factor  in  height. 

Previous  works  [11,  34,  67]  on  deriving  the  generalized  inverse  forms  of  (A  + 
UVT)  were  presented  under  restrictive  assumptions.  Although  our  generalized  capac- 
itance matrix  theorems  are  developed  for  solving  general  linear  systems,  it  is  possible 
to  derive  the  generalized  inverse  form  of  (A  +  UGVT)  for  more  general  cases  based 
on  these  theorems. 


CHAPTER  4 
GENERALIZED  CAPACITANCE  MATRIX  ALGORITHM 


4.1     Structure  of  the  Algorithm 

From  the  generalized  capacitance  matrix  theorems  presented  in  chapter  2,  we 
can  generalize  the  standard  capacitance  matrix  algorithm  for  general  matrices  A  and 
B  as  follows. 

1.  Solve  equation  3.32  for  x. 

2.  Solve  equation  3.33  for  ifc,  for  *  =  1, . . .  ,p. 

3.  Form  the  capacitance  matrix  C  from  equation  3.34. 

4.  Solve  the  capacitance  matrix  equation  (i.e.  eq.  3.35)  for  /3. 

5.  The  solution  x  =  x  —  rj/3. 

This  generalized  capacitance  matrix  algorithm  can  always  be  applied  to  the  cases 
when  nu//(B)  C  null(A)  and  range(A)  C  range(B)  or  when  null(A)  C  nu//(B) 
and  range(B)  C  range(A),  as  stated  in  Theorem  1  and  2  respectively.  However, 
to  apply  this  algorithm  to  more  general  cases,  we  need  to  make  sure  that  a  solution 
exists  to  the  capacitance  matrix  equation  in  step  4.  Theorem  3  gives  a  sufficient 
condition  for  the  uniqueness  of  the  solution  to  the  capacitance  matrix  equation  in 
the  general  case. 
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Note  that  the  step  2  of  the  algorithm  requires  the  solutions  ifc, . . . ,  Tjp  in  equation 
3.33  with  different  RHSs.  The  computational  cost  to  solve  this  equation  p  times  is 
very  high  when  p  is  large.  However,  the  vectors  ■ql,...,'qp  can  be  easily  obtained 
when  the  matrix  A  is  chosen  to  be  circulant  Toeplitz  and  the  vectors  u,  have  a  very 
sparse  set  of  non-zero  entries.  The  circulant  Toeplitz  property  of  the  matrix  A  makes 
it  shift-invariant,  which  means  A5(u)  =  S(Au)  for  all  u  €  3ftn  where  S  is  a  shift 
operator  [59].  A  linear  shift-invariant  operator  can  be  completely  characterized  by  its 
impulse  response,  which  is  the  solution  of  the  corresponding  linear  system  Au  =  e 
where  the  RHS  e  is  a  unit  vector.  Thus,  the  solution  ?7,  can  be  obtained  via  the 
convolution  of  the  impulse  response  for  this  shift-invariant  linear  system  with  the 
RHS  in  equation  3.33,  which  is  sparse  since  the  vectors  Ui,...,up  themselves  are 
sparse.  The  convolution  of  the  impulse  response  with  a  sparse  vector  can  be  easily 
obtained  by  a  linear  combination  of  the  shifted  impulse  response  vectors.  When  the 
vectors  v,,  i  =  l,...,p,  are  also  sparse,  the  computation  of  the  matrix  V  tj  for 
forming  the  capacitance  matrix  C  can  be  further  simplified  by  obtaining  the  (i,j)-th 
entry  of  Vttj  from  the  impulse  response  corresponding  to  the  relative  locations  of 
the  nonzero  components  of  V;  and  those  of  Ui, . . . ,  umA_m  and  u,-. 

The  direct  computation  of  rj(3  in  the  final  step  of  the  generalized  capacitance 
matrix  algorithm  requires  0(pn)  operations,  which  are  very  costly  when  p,  the  rank 
of  difference  between  B  and  A,  is  large.  Instead,  we  solve  the  following  linear  system 


Ay  =  U/J-    £    5L*  (4.1) 
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to  get  the  solution  y  which  has  the  same  projection  on  the  range  of  A  as  rj(3.  As  for 
the  projection  of  rjf3  on  the  null  space  of  A,  it  can  be  easily  computed  from  the  inner 
product  of  /3  and  a,  =  (aM, . . .  ,a,lP),  where  a,j  =  qflj,,  for  1  <  i  <  mA  -  m.  The 
coefficients  a;  can  be  chosen  arbitrarily  as  long  as  the  assumption  that  ^  .  ..rjmA_m 
are  linearly  independent  and  linear  combinations  of  q'1? . . . ,  q'm/1_m  is  satisfied.  Thus, 
rjf3  can  be  obtained  from  the  following  equation. 


mA-m  T 


The  flowchart  of  the  generalized  capacitance  matrix  algorithm  is  given  in  figure 
4.1. 

4.2     Choice  of  the  Splitting 

From  the  variational  formulations  for  the  early  vision  problems,  we  can  see  it 
is  necessary  to  solve  linear  systems  of  the  form  Kx  =  b,  where  the  matrix  K  £ 
$NxN(N  -  n2)  is  the  sum  of  the  matrices  Kd  and  AK,  obtained  from  the  data  con- 
straint and  smoothness  constraint  respectively.  To  apply  the  generalized  capacitance 
matrix  algorithm  to  solve  the  linear  systems,  we  have  to  choose  a  splitting  scheme 
for  K  at  first.  We  discuss  the  nice  choice  of  splitting  K  in  this  section. 

Assume  we  split  the  matrix  K  into  two  components  K0  and  UV  .  Note  that  the 
matrices  K  and  K0  can  be  replaced  by  the  matrices  B  and  A,  respectively,  used  in 
chapter  3.  The  notation  K  and  K0  (popularly  used  in  computational  vision  literature) 
are  used  instead  of  B  and  A  (widely  used  in  the  field  of  capacitance  matrix  methods) 
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Input  data 


Split  B  into  A  +  UGV 


1 


Solve  eq.(3.32)  for  x 


Solve  eq.(3.33)  forrij 


Form  the  capacitance 
matrix  C  from  eq.(3.34) 


Solve  capacitance  matrix 
eq.  in  eq.  (3.35)  for  P 


J 


Solve  eq.(4.1)  for  y  & 
compute  T|P  from  eq.(4.2) 


I  Tip 


Compute  x  -  tjP 


Figure  4.1.  The  flow  chart  of  the  generalized  capacitance  matrix  algorithm. 

for  the  rest  of  this  chapter.  Using  the  generalized  capacitance  matrix  algorithm  to 
solve  the  linear  system  Kx  =  b  with  the  above  splitting  for  K,  we  need  to  solve  several 
linear  systems  of  the  form  K0x  =  b  as  shown  in  equations  3.2  and  3.3  respectively. 
Especially,  in  equation  3.3,  we  have  to  solve  it  p  times,  where  p  is  the  rank  of  K  —  K0. 
Therefore,  it  is  very  crucial  to  choose  a  well  structured  K0  so  as  to  efficiently  compute 
the  solution  in  the  capacitance  matrix  method. 

In  general,  there  are  three  criteria  that  dictate  the  choice  of  Ko.  The  first  is 
the  difference  between  K0  and  K  must  be  small  in  the  sense  that  rank(K  —  K0)  is 
small.  Secondly,  Ko  must  possess  some  nice  structure  so  as  to  have  a  fast  numerical 
solution  to  the  associated  linear  system.  Lastly,  Kq  should  be  translation  invariant; 
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this  property  has  the  advantage  of  saving  the  computational  cost  in  solving  equation 
3.3  for  each  u,,  0  <  i  <  p  -  1,  since  there  are  only  one  or  two  nonzero  components  in 
each  u,.  Therefore,  we  need  to  compute  the  solution  h0  to  a  linear  system  consisting 
of  a  matrix  K0  and  an  RHS  vector  which  is  the  projection  of  the  unit  vector  onto 
the  range  of  K0.  Because,  h0  is  translation  invariant,  we  can  obtain  the  solution  of 
r)i  in  equation  3.3  without  having  to  solve  any  of  the  linear  systems  associated  with 

each  TJi- 

For  ease  of  exposition,  we  consider  the  splitting  scheme  for  the  surface  recon- 
struction problem  in  the  following.  This  splitting  schemes  can  be  extended  to  other 
early  vision  problems. 

The  stabilizer  discussed  in  the  previous  section  imposes  smoothness  constraints 
on  the  surface  reconstruction  problem  leading  to  a  C°  surface  i.e.,  the  membrane 
spline,  a  C1  surface  namely,  the  thin  plate  spline,  or  a  combination  thereof  called 
the  thin-plate-membrane  spline.  As  mentioned  earlier,  the  numerical  solution  to  the 
surface  reconstruction  problem  involves  solving  a  linear  system  Kx  =  b.  The  matrix 
K  is  large  and  sparse  but  not  tightly  banded.  This  is  primarily  due  to  the  fact 
that  a  vector  has  been  used  to  represent  a  two-dimensional  array  which  leads  to 
representing  neighboring  interactions  in  two-dimension  by  interactions  between  very 
distant  elements  in  one-dimension.  The  matrix  K  is  a  sum  of  two  matrices  Ka  and 
Kd,  with  Ks  containing  the  membrane  or  thin-plate  molecules  [81]  defined  on  the 
interior  and  boundary  of  0.  The  K^  matrix  is  a  diagonal  matrix  containing  non-zero 
weights. 
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By  making  use  of  the  structure  of  the  matrix  K,  we  can  split  K  into  two  com- 
ponents K0  and  UVT,  with  K0  €  $tN*N  and  U,  V  €  &Nxp,  such  that  K0  is  close  to 
K,  i.e.,  the  rank  of  UVr  is  small,  and  there  exists  a  fast  algorithm  to  solve  a  linear 
system  with  the  matrix  K0.  Then,  the  solution  to  the  linear  system  Kx  =  b  can 
be  obtained  by  using  the  generalized  capacitance  matrix  algorithm  which  involves 
solving  the  well-structured  linear  systems  with  the  matrix  K0  and  the  capacitance 
matrix  linear  system. 

The  choice  of  K0  depends  on  the  data  compatibility  matrix  Kj  and  the  smooth- 
ness constraint  matrix  Ks.  We  have  different  choices  of  K0  for  sparse  (scattered)  and 
dense  data  cases.  They  are  given  in  the  following  sections. 

4.2.1     Dense  Data  Surface  Reconstruction 

We  will  first  describe  our  choice  of  the  matrix  K0  and  the  equivalent  Lyapunov 
matrix  equation  for  the  dense  data  surface  reconstruction  with  a  membrane  spline. 
Then  we  will  present  the  matrix  K0  for  the  thin-plate-membrane  spline  surface  re- 
construction based  on  the  construction  scheme  for  the  membrane  problem.  The  thin- 
plate  spline  is  a  special  case  of  thin-plate-membrane  spline  and  will  not  be  discussed 
separately.  In  addition,  the  selection  of  the  matrices  U  and  V  will  be  discussed  for 
both  cases. 

Membrane  spline  for  the  dense  data  case 

The  data  compatibility  matrix  K^  for  the  dense  data  case  with  uniform  weight- 
ing is  an  identity  matrix.  Focusing  on  the  surface  reconstruction  problem  in  a  rectan- 
gular domain,  the  matrix  Ko  is  chosen  to  contain  both  the  membrane  computational 
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molecules  and  the  data  constraint  molecules  [81]  all  over  the  domain  ,  which  are 
shown  in  figure  4.2,  Using  this  choice  of  K0,  we  can  transform  the  original  linear 
system  Kx  =  b  to  an  equivalent  Lyapunov  matrix  equation. 


(a) 


(b) 


Figure  4.2.  (a)  Data  constraint  molecule  and  (b)  membrane  computational  molecules 
periodically  imposed  on  the  domain  fi. 


With  the  above  choice,  the  matrix  K0  can  be  written  as  follows. 


Kq  —  AKmeTO  +  Krf  —  AKTOem  +  I? 


(4.3) 


where 


Kr 


D    -I 
-I       . 


-I 


-I 


.    -I 
I     D 


(4.4) 
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4    -1 


-1       4    -1 


-1 


D    = 


€»" 


(4.5) 


-1 


-1       4    -1 


-1       4 


The  matrix  Kmem  can  be  regarded  as  the  result  obtained  from  the  5-point  ap- 
proximation to  Laplacian  operator  on  a  doubly  periodic  imbedded  rectangular  do- 
main. From  a  computational  molecule  [81]  point  of  view,  only  the  membrane  molecule 
is  used  at  every  node  of  the  doubly  periodic  rectangular  domain.  Therefore  the  molec- 
ular inhibitions  due  to  depth  discontinuities  (if  any)  and  imposed  doubly  periodic 
smoothness  are  included  in  the  matrix  UVT.  Given  the  matrices  K  and  K0,  we  can 
show  that  the  choice  of  the  matrices  U  and  V  such  that  UVT  =  K  -  K0  where  both 
matrices  U  and  V  are  of  full  column  rank  is  not  unique.  Our  selection  method  for 
the  matrices  U  (=  [ul5 . . . ,  up])  and  V  (=  [vl5 . . . ,  vp])  is  as  follows.  Let  each  column 
vector  U,-,  1  <  *  <  p,  contain  only  two  nonzero  entries  +1  and  —1  located  at  the  two 
neighboring  sites  separated  by  the  i-th  discontinuity.  Then  select  the  column  vector 
V;  to  be  -Au,,  1  <  »  <  p.  Thus  the  matrices  U  and  V  are  of  full  column  rank  p 
where  p  is  the  number  of  discontinuities.  Our  selection  scheme  has  the  advantage 
that  the  column  vectors  of  U  and  V  are  very  sparse  (only  two  nonzero  entries  in  each 
column),  a  structure  that  is  very  conductive  to  the  design  of  an  efficient  numerical 
algorithm.  We  will  discuss  more  of  this  in  section  4.3. 


58 


We  can  see  that  this  particular  choice  of  K0  makes  it  circulant  Toeplitz  and  thus 
translation  invariant.  In  addition,  it  can  be  decomposed  as  the  sum  of  two  tensor 
products  of  matrices  A  and  I,  i.e., 


K0  =  A(g>I  +  I<g>A, 


(4.6) 


with 


A  = 


i  +  2A         -A 

-A 

-A    |  +  2A 

-A 

*'• 

''• 

-A    |+2A         -A 

-A 

-A    \  +  2X 

€  »nxn, 


and  ®  is  the  Kronecker  product.  By  using  this  special  structure  of  the  matrix  K0,  we 
can  rewrite  any  linear  system  of  the  form  K0z  =  f  as  the  following  Lyapunov  matrix 


equation 


AZ  +  ZA  =  F, 


(4.7) 


where  Z  and  Farenxn  matrices  corresponding  to  their  concatenated  n2  x  1  vectors 
z  and  f  respectively.  It  should  be  noted  that  n2  =  N,  and  that  the  matrix  A  in 
equation  4.7  is  circulant  Toeplitz  and  symmetric  positive-definite.  We  can  use  the 
ADI  (Alternating  Direction  Implicit)  method  to  solve  this  Lyapunov  matrix  equation 
in  0(N)  operations.  We  will  prove  this  0(N)  computational  complexity  of  the  ADI 
method  in  section  4.3. 
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Thin-plate-membrane  spline  for  the  dense  data  case 

The  construction  of  the  matrix  K0  for  the  thin-plate-membrane  spline  prob- 
lem is  based  on  the  above  construction  for  the  membrane  spline.  Analogous  to  the 
matrix  Kmem  which  is  a  well-structured  matrix  and  close  to  the  smoothness  con- 
straint matrix  Ks  constructed  for  the  membrane  problem,  a  matrix  Ktp  is  used  for 
the  smoothness  constraint  matrix  discretized  from  the  thin-plate  smoothness  energy. 
The  differentiation  of  the  membrane  smoothness  energy  function  corresponds  to  a 
Laplacian  operator  A,  while  the  corresponding  operator  for  the  thin-plate  smooth- 
ness energy  is  a  biharmonic  operator  A2.  Thus,  we  can  extend  the  selection  scheme 
for  the  membrane  problem  to  the  thin-plate  problem  by  choose  the  matrix  Ktp  as 

KtP  =  KLm-  (4-8) 

The  smoothness  matrix  for  the  thin-plate-membrane  spline  is  a  convex  combination 
of  the  membrane  and  thin-plate  smoothness  matrices.  Let's  define  the  matrix  Kjpm 
as  a  convex  combination  of  Kmem  and  Ktp,  i.e., 

Ktpm  =  aKtp  +  (1  -  <7)Kmem,  (4.9) 

where  0  <  a  <  1.  Similar  to  that  of  the  membrane  spline,  the  matrix  K0  is  chosen 
to  be  AK(pm  + 1,  which  can  be  written  as 

K0    =    \<rK2mem  +  \(l-<T)Kmem+I 
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=    (aKmem+I)(6Kmem+I),  (4.10) 


where 


\{l-a)  +  y/\2{l-a)2-^   A(l-<7)-yA2(l-<7)2-4A<r 
a,b  —  -  i  2  \  '     ) 


When  A  >  (11^2,  a  ana"  6  are  always  real  and  positive.  Then  the  matrix  K0  can  be 
factored  into  two  real  matrices  (aKmem  + 1)  and  (6Kmem  + 1),  which  are  of  the  same 
form  as  the  matrix  K0  for  the  membrane  dense  data  problem.  Thus,  the  solution  to 
a  linear  system  with  the  matrix  K0  can  be  obtained  by  solving  two  Lyapunov  matrix 
equations,  which  can  be  solved  using  the  ADI  method  in  O(N)  time.  However,  when 
the  above  condition  is  violated,  a  and  b  as  well  as  the  factoring  matrices  in  equation 
4.10  become  complex.  It  is  more  complicated  to  use  ADI  method  to  solve  the  complex 
Lyapunov  matrix  equations  and  analyze  the  computational  complexity.  Therefore, 
we  limit  our  discussion  to  the  case  when  the  aforementioned  condition  is  satisfied  for 
the  dense  data  surface  reconstruction  problem. 

Using  the  above  choice  for  the  matrix  K0,  we  can  see  that  the  matrices  U  and 
V  encode  the  inhibition  of  membrane  and  thin-plate  computational  molecules  due  to 
the  presence  of  depth  and  orientation  discontinuities  (if  any)  and  the  imposed  doubly 
periodic  smoothness  over  the  imbedded  rectangular  domain. 

4.2.2     Sparse  Data  Surface  Reconstruction 

The  splitting  of  matrix  K  for  the  sparse  data  surface  reconstruction  problem  is 
discussed  in  this  section.  The  only  difference  between  the  splitting  schemes  for  the 
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sparse  data  problem  and  the  dense  data  problem  is  the  treatment  of  the  data  com- 
patibility matrix  Kj.  For  the  dense  data  case  discussed  above,  the  matrix  Kj  is  an 
identity  matrix  and  is  arranged  into  K0,  which  can  be  rewritten  as  a  Kronecker  prod- 
uct of  two  well-structured  matrices.  For  the  sparse  data  case,  the  nonzero  elements  of 
the  matrix  Kd  are  sparsely  distributed  along  the  diagonal,  thus  every  nonzero  entry 
of  the  sparse  diagonal  matrix  Kd  is  decomposed  into  the  outer  product  of  u(  and  v,' 
and  included  into  U  and  V  as  follows: 


U    = 
V    = 


where  the  vectors  Ui,...,u&,  Vi,...,Vfc  account  for  the  molecule  inhibition  due 
to  depth  and  orientation  discontinuities  (if  any)  and  the  imposed  doubly  periodic 
smoothness  over  the  embedded  rectangular  domain.  With  the  above  choice  of  U  and 
V,  each  column  of  U  and  V  is  still  very  sparse,  containing  only  one  or  two  nonzero 
entries.  Our  numerical  algorithm  will  take  advantage  of  this  sparsity  property. 

Similar  to  the  dense  data  case,  the  matrix  K0  can  be  decomposed  as  the  sum  of 
two  Kronecker  products  of  A  A  and  I.  Therefore,  a  linear  system  of  the  form  K0z  =  f 
can  be  written  as  the  Lyapunov  matrix  equation  AZ  +  ZA  =  F. 

4.3     Numerical  Solution 

In  this  section,  we  discuss  the  convergence  of  ADI  method  applied  to  the  Lya- 
punov matrix  equation  and  show  that  the  computational  cost  for  solving  this  equation 
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is  0(N)  by  proving  that  the  ADI  method  can  converge  in  a  constant  number  of  it- 
erations independent  of  N.  In  addition,  we  present  a  modified  biconjugate  gradient 
method  that  is  used  in  solving  the  capacitance  linear  system. 

4.3.1     ADI  Method  for  the  Lvapunov  Equation 

The  ADI  method  for  solving  a  Lyapunov  matrix  equation  is  described  in  [48]. 
For  any  Lyapunov  matrix  equation  of  the  form  AX  +  XA  =  B,  the  ADI  method 
involves  the  following  steps  in  each  iteration  j  =  1, 2, . . . ,  J, 

{A  +  PjI)XH    =    B-X^A-^I)  (4.12) 

X,(A+PjI)    =    B  -  (A  -  p^Xj.,  (4.13) 

For  our  problem,  note  that  the  matrix  A  is  circulant  Toeplitz  and  symmetric  positive 
semidefinite  as  discussed  in  section  4.2.  The  special  structure  of  matrix  A  helps  in 
advancing  the  ADI  iterations  very  efficiently  in  each  step  of  the  technique.  On  the 
left-hand  side  (LHS)  of  the  equations  4.12  and  4.13  respectively,  the  matrix  A  +  Pjl 
is  very  close  to  being  tridiagonal  except  for  the  nonzero  entries  at  the  corners  namely, 
the  entries  (l,n)  and  (n,l).  These  nonzero  corner  entries  are  caused  due  to  the  doubly 
periodic  boundary  conditions  discussed  in  the  last  section.  In  addition,  A+pjI  is  SPD 
for  any  positive  parameter  pj.  Therefore,  we  can  compute  the  LU  decomposition  of 
this  matrix  and  then  use  the  forward  and  back  substitution  to  compute  the  updated 
solution.  Due  to  the  the  special  structure  of  the  matrix  A  +Pjl,  the  solution  update 
requires  only  O(N)  time  per  iteration. 
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We  now  examine  the  convergence  of  the  ADI  method  for  the  Lyapunov  matrix 
equation  in  our  algorithm.  Let  the  initial  X0  =  0  (zero  matrix),  and  AXt  =  Xt  -X*, 
where  X*  is  the  true  solution  of  the  Lyapunov  matrix  equation,  A0,  Ai, . . . ,  An_x,  and 
qo,qi,  •  •  •  ,qn-i  be  the  eigenvalues  and  the  eigenvectors  of  A  respectively.  For  this 
symmetric  matrix  A,  we  have 

A  =  QDQr,  (4.14) 

where  Q  =  [q0  qi  ■••  q»-i],  D  =  diag[\0,  Xu  ■  •  -,Aw-i],  and  Afc  =  4  sin  (— ^-) 
for  k  —  0, . . .  ,n  —  1.  We  express  the  eigenvalue  Afc  in  this  form  so  as  to  have  nonde- 
creasing  eigenvalues  with  respect  to  the  indices.  The  range  of  eigenvalues  is  between 
0  and  4,  and  the  zero  eigenvalue  (when  k  =  0)  is  simple.  The  orthogonal  matrix 
Q  contains  the  discrete  sine  or  cosine  function  in  each  column  with  increasing  fre- 
quency corresponding  to  increasing  eigenvalue,  i.e.,  the  index  increases.  For  example, 
the  eigenvector  corresponding  to  the  simple  zero  eigenvalue  is  a  flat  function  whose 
frequency  is  0. 

By  taking  the  tensor  product  of  the  eigenvectors,  qt  ®  q/,  k,  I  =  0, . . . ,  n  -  1,  we 
can  generate  an  orthogonal  basis  for  3?"xn  and  express  the  true  solution  X*  in  this 
basis  as  follows: 

x*  =  £l>u(q*®qi)-  (4-15) 

fc=o  ;=o 
Subtracting  each  of  the  equations  4.12  and  4.13  from  the  Lyapunov  equation  AX  +  XA 
=  B,  and  expressing  X  in  the  above  tensor  product  basis,  we  have  the  following  result 
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for  the  error  after  t  iterations, 


n— 1 n— 1 

fc=o  ;=o 


Tt(Xk~Pj)(X'~Pj) 

}£KXk+prh+Pi 


<»m(<i*®<h)-  (4-16) 


The  function  inside  the  bracket  is  the  error  reduction  factor  in  the  basis  q^  ®  q/. 
Notice  that  the  Lyapunov  matrix  equation  has  an  infinite  number  of  solutions  due  to 
the  singularity  of  the  matrix  A.  The  error  reduction  in  each  step  is  always  less  than 
1  for  eigenvectors  with  positive  eigenvalues  and  positive  parameters  pj  while  it  is 
always  1  for  the  basis  qo®qo  (which  is  evident  from  equation  4.16  after  substituting 
k  —  I  =  0  and  A0  =  0).  A  solution  to  the  Lyapunov  matrix  equation  can  be  obtained 
without  reducing  this  factor  at  the  component  k  =  I  =  0. 

The  classical  ADI  minimax  parameter  problem  is  to  find  the  ADI  parameters  pj 
to  minimize  the  maximum  of  the  error  reduction  function  excluding  the  zero  eigen- 
value, i.e.,  to  minimize  the  function 


fxt=     max     |]T(^— EL)(^1_W)|.  (4.17) 


Applying  the  result  from  the  classical  ADI  minimax  analysis  [48],  leads  to  an  0(log  N) 
iterations  for  convergence  [85].  This  in  turn  yields  an  overall  computational  complex- 
ity of  0(N  log  N)  for  the  ADI  method  since,  O(N)  time  is  required  per  iteration  in 
the  ADI  method.  An  0(N  log  N)  computational  complexity  is  unacceptable  for  our 
purposes  (recall  that  we  seek  an  O(N)  solution).  Thus,  in  order  to  design  an  O(N) 
algorithm,  it  is  essential  to  use  the  additional  knowledge  about  the  problem  e.g., 
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spectral  characteristics  of  the  differential  operator  in  the  PDE.  The  classical  ADI 
convergence  analysis  does  not  make  use  of  any  such  knowledge.  In  this  thesis,  we  re- 
formulate the  ADI  convergence  analysis  by  incorporating  the  spectral  characteristics 
of  the  Laplacian  operator.  This  is  facilitated  by  making  the  following  assumption. 

Assumvtion  1  Let  the  projection  ofB,  the  RHS  of  the  Lyapunov  matrix  equation,  on 
the  basis  qk  <8>  q/  be  bki-  Assume  there  exist  a  constant  a  €  3R+  and  a  wm  >  0  such 
that  ,V(M)\(0,0),  IM  <  (P^p  for  some  m  >  -0.5,  and  £fc£,(^)2  >  <*«&• 

For  our  Lyapunov  matrix  equation  AX  +  XA  =  B,  the  relationship  between  the 
solution  X  and  the  data  B  in  the  eigen  basis  of  the  matrix  K0  is  a^  =  j^f^- 
Therefore,  the  assumption  |6W|  <  (fcffi2)m  =»  |aw|  <  (Xk+x^2+l2)m  for  some  m  > 
-0.5.  Using  the  relation  \x  <  sinx  <  x  for  x  €  [0,tt/2],  we  have  cxk2  <  \k  <  c2k2 
with  c\  =  -$2  and  c2  =  %•.  Thus,  requiring  that  the  solution  satisfy  the  constraint 

\aki\  <  (k2+pyn+i  for  some  m  >  -0.5. 

This  assumption  can  be  given  the  following  spectral  interpretation.  The  smaller 
values  k  or  /  correspond  to  lower  frequencies  along  vertical  or  horizontal  direction,  and 
vice  versa.  Consequently,  the  assumption  means  that  the  overall  frequency  content 
of  the  input  data  or  the  right  hand  side  can  be  bounded  by  an  increasing  function 
of  ||u;||2  =  (k2  +  l2)05  of  degree  less  than  1.  This  can  be  easily  satisfied  when  the 
overall  frequency  distribution  of  the  input  data  is  decreasing  or  uniform.  This  is  true 
in  reality  since,  the  low  frequency  content  in  signals  is  normally  much  higher  than 
the  high  frequency  content. 
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We  first  define  two  matrices  X'  and  AX't  for  use  in  the  convergence  analysis 
of  the  ADI  method.  Let  X'  and  AX't  be  the  projections  of  the  true  solution  X* 
and  the  error  in  computed  solution  (in  iteration  t)  AXt  on  the  range  of  the  matrix 
K0  (  =  A  ®  I  + 1  ®  A),  respectively.  Note  that,  X'  and  X*  are  almost  the  same 
except  that  X'  doesn't  have  projection  at  the  basis  q0  ®  qo-  In  the  ADI  literature 
[48],  the  convergence  criterion  is  normally  defined  as  follows:  The  ADI  method  for 
the  Lyapunov  matrix  equation  is  said  to  have  converged  when  the  relative  Frobenius 
norm  ^fjx'll^  1S  re(Iuced  to  within  a  specified  error  tolerance  e  lying  between  0  and 
1,  i.e.,  ||AX't||F  <  e||X'||F.  This  means  we  don't  need  to  reduce  the  error  component 
in  the  basis  corresponding  to  the  zero  eigenvalue  or  in  the  null  space  of  K0.  The 
problem  is  to  find  the  number  of  iterations  required  in  the  ADI  method  and  the 
associated  ADI  parameter  set  to  achieve  ||AX'j||F  <  e||X'||F. 

We  make  use  of  this  convergence  criterion  and  assumption  1  to  reformulate  the 
convergence  theory  of  the  ADI  method  as  follows.  The  Frobenius  norm  ||  AX't||F  can 
be  written  as, 

where  the  double  summation  takes  every  k  and  /  between  0  and  n  —  1  except  k  = 
I  =  0  (simultaneously).  As  a  consequence  of  assumption  1  the  ADI  method  for  our 
Lyapunov  matrix  equation  can  converge  in  a  constant  number  of  iterations.  We  state 
and  prove  this  convergence  result  in  the  following  theorem. 

Theorem  4  Given  assumption  1  and  an  error  tolerance  t  between  0  and  1,  there  exists 
a  constant  integer  t  such  that  Vn;  ||  AX't||F  <  e||X'||F. 


hi    ^«Vy^   hL_M^*2  (419) 

16 
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Proof:    From  the  discussion  on  assumption  1,  we  have  the  following  inequality  for 
IIX'IIf. 

Applying  assumption  1  to  equation  4.18,  we  have  the  following  inequality 


2  ii 


where  q  is  a  positive  number  between  0  and  2m  +  1  and 


**•» ^(^fi^O'         (420) 


Using  the  fact  that  \k  =  4sin2(^-^)  >  4(£)2  for  A;  >  0  and  substituting  into  the 
above  inequality,  we  have 


l^XMI^<p=^^^^{((i)a  +  (|)a)^+a_,(^)/^^ P»)},     (4-21) 


Then  we  have, 

(4.22) 
Note  that  the  double  summation  on  the  RHS  of  equation  4.22  takes  every  integer  k 
and  /  between  0  and  n  —  1  except  k  =  I  =  0  simultaneously.   Let  the  set  of  all  the 
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pair  (k,  I)  in  this  double  summation  be  denoted  by  A.  By  using  the  split  in  the  set 


A,  we  have 


EE/(M)i  =  i{/(fl.i)+/(M)+/(MHEM»)+2M0+  £  /(*.0>. 

77  "2       "  k=2  1=2  (k,i)eB 

(4.23) 

where  /(*,/)  =  ((i)a+(£)»)»"+2-«  and  5  =  (M)\(M)  \  k,l  =  l,...,n  -  1.  By  using 
the  fact  that  /(fc,  /)  is  a  strictly  decreasing  function  of  k  and  /,  we  can  bound  the 
terms  with  summation  in  equation  4.23  by  the  corresponding  terms  in  an  integration 
form  as  follows. 


EEMOi  <  {/(o,i)  +  /(i,o)  +  /(i,i)}-j  +  -/""/(x,o)& 

4  /*  /(0, ,)  *  +  /  /s  (x2  +  ;)w„  **.    (4.24) 

where  the  integration  domain  5  =  {(a;,y)|(a?,y)  6  [0,1]  x  [0,1]\[0,J)  x  [0,J)>.  By 
transforming  the  Cartesian  coordinate  to  the  polar  coordinate,  the  term  with  double 
integral  in  equation  4.24  can  be  bounded  by  the  following  integral, 

/f    if     S     \    o    rdrdO      =         /n         -^ {n4™+2-2q_22m+l-q) 

Jo    Jt     r4m+4~2"  4(2m  +  1  -  q) V  ' 

< -n4m+2-2'.  (4.25) 

4(2m  +  l-q)  K        ' 
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Substituting  equation  4.25  into  equation  4.24  and  calculating  the  integrals  in  equation 
4.24,  we  obtain  the  following  inequality 


T  T  -r r 4  <  Pn4m+2-2\  (4.26) 

r  ,  (&  +  &)2m+2-q  n2 


where  fi  =  2  +  ^^^  +  j^z^  +  A(2mlx.q)  is  a  constant. 

Using  equations  4.19,  4.22  and  4.26,  the  convergence  criterion  ||  AXt||F  <  c||X'||f 
can  be  met  with  by  the  following  derived  condition 


rvf24-' 


where  M  is  the  constant  defined  as  M  :=  at  I      Furthermore,  we  can  bound  the 
function  maxfcj/\(0,o)  R{pi,P2,  ■■■,Pt)  by  a  function  of  A*  as  follows: 

max     R{pu...,Pt)    =        max    (,.     *  f[(h!LZlL)*)m(  ^^)») 

v(M)\(o,o)    v^'       *w  v(fc,o\(o,o)v(Afe  +  A,)«/ii  Afc+Pj '  A£J  X'+Pj 


s                          *     TTf^k  ~  Pi\2 
<       max     Tyllt-T ) 


The  maximum  taken  from  the  values  of  function  R  at  the  nonzero  eigenvalues, can 
be  bounded  by  using  the  maximum  of  the  same  function  in  an  interval  containing  all 
the  nonzero  eigenvalues.  Then  the  convergence  can  be  reached  by  using  the  following 
modified  requirement 

max  -  fK^1^-)2  <  Mn2\  (4.28) 

A1=4sin2(f)<x<4      Xf  fj[    X  +pj 
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For  any  positive  pj,  the  function  j^UUii^)2  is  always  less  than  the  right-hand 


si 


ide  when  a;  >  — f— .  Therefore,  in  equation  4.28,  the  interval  (4 sin  (f),4)  can  be 


Mln2 


replaced  by  S  =  (4sin2(-),mm(4,  —h-)).  The  convergence  requirement  in  equation 

n  Mln2 

4.28  can  be  made  less  stringent  by  using  the  inequality, 


.  L  TT(^-^)2  <  (max  TT(^-^)2)(max- 


max_L]-T(^_^)2  <  (maxTT(^_^)2)(max_)  (4.29) 

xes  xi+^x  +  v;      ~    *es  ±AKx  +  Vi     A  *€S  x*" 


leading  to  the  following  modified  requirement  which  when  satisfied  automatically 
makes  the  inequality  4.28  true. 


maxfK^-^)2  <  (Mn2')( —. -)  =  (Mn2')(4sin2(-))'  =  M'         (4.30) 


This  bound  M'  will  approach  a  constant  when  n  — ►  oo.  Thus,  the  problem  becomes 
one  of  finding  the  number  of  iterations  t  and  the  parameters  pj  such  that  the  re- 
quirement in  4.30  is  satisfied.  This  requirement  is  of  the  same  form  as  the  classical 
ADI  minimax  problem  which  was  solved  by  Jordan  as  quoted  in  [87].  We  can  use  the 
result  in  [87]  to  determine  the  number  of  iterations  required  for  convergence.  The 
number  of  iterations  needed  to  meet  this  requirement  is 


lnJLin± 
j    =    f2LMpLl  (4.31) 

k'      =      )=*— >  (4-32) 

V'  +  vV2  -  1 
v'    =    \(»  +  \),  (4-33) 
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where,  \z\  denotes  the  smallest  integer  larger  than  z  and  v  is  the  expansion  factor  of 
an  interval  (a,  b)  defined  as  |.  The  expansion  factor  v  of  the  interval  of  interest  (S) 
is  fixed  when  n  approaches  infinity,  i.e. 


lim  v  =  -r^nUZ-  (°4) 


In  the  classical  ADI  minimax  formulation,  u  is  the  spectral  radius  of  the  eigenvalues 
being  considered  in  A,  which  is  shown  be  of  the  order  of  n2  [85].  Since  we  use 
a  different  formulation  for  convergence  and  impose  the  spectral  assumption  on  the 
data,  the  resulting  v  becomes  m%^Jm  •  When  n  -»  oo,  the  ratio  v  approaches  a 
constant  as  given  in  equation  4.34  and  M'  also  approaches  a  constant  as  mentioned 
above.  Therefore,  we  conclude  that  lim^oo  J  is  a  constant,  i.e.  there  is  a  constant  t 
such  that  ||AX't||F  <  e||X'||F  is  satisfied  for  all  n.     ■ 

In  [87],  an  optimal  set  of  ADI  parameters  for  the  general  ADI  iterations  was 
given.  We  may  use  these  parameters  to  solve  the  minimax  problem  in  equation  4.30. 
Although  this  set  of  ADI  parameters  is  not  the  optimal  set  to  meet  our  convergence 
requirement,  it  provides  us  with  a  parameter  set  which  can  achieve  the  convergence 
criterion  in  a  constant  number  of  iterations.  Therefore,  we  use  the  result  in  [87]  to 
select  a  suboptimal  ADI  parameter  set.  Solving  for  the  optimal  ADI  parameters  in 
our  problem  requires  complicated  analysis  using  transformation  theory  [87]  and  is 
currently  under  investigation. 
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4.3.2     Modified  BCG  Method  for  the  Capacitance  Matrix  Equation 

In  this  section  we  present  an  algorithm  to  solve  the  capacitance  matrix  equa- 
tion defined  earlier.  The  conjugate-gradient(CG)  method  is  a  very  powerful  iterative 
scheme  to  solve  the  symmetric  positive  definite  linear  system.  A  natural  generaliza- 
tion of  the  CG-type  method  to  solve  a  general  nonsymmetric  linear  system  is  called 
the  biconjugate  gradient(BCG)  method  [46].  The  linear  system  with  the  capacitance 
matrix  is  a  nonsymmetric  and  indefinite  system.  The  size  of  the  capacitance  ma- 
trix is  determined  by  the  rank  of  the  matrix  UVr,  which  contains  the  difference 
between  K  and  Ko-  With  the  K0  proposed  in  section  4.2,  the  size  of  the  associated 
capacitance  matrix  is  usually  0(n).  The  BCG  method  requires  the  computation  of  a 
matrix- vector  product  in  each  iteration.  Since  the  capacitance  matrix  C  is  dense,  this 
matrix-vector  product  takes  0(N)  operations.  In  this  section,  we  present  a  modified 
biconjugate  gradient  method  to  solve  this  linear  system  by  applying  the  wavelet  trans- 
form technique  to  approximate  the  multiplication  and  reduce  the  computational  cost 
to  0(n).  Since  the  BCG  converges  in  0(n)  iterations,  the  computational  complexity 
for  the  modified  BCG  method  is  0(n2)  =  0{N). 

The  BCG  algorithm  [46]  for  the  nonsymmetric  linear  system  C/3  =  f  is  as 
follows. 

0.  Choose  /30,  and  set  q0  =  r0  =  f  -  C/30;  Choose  f0  /  0,  and  set  q0  =  r0; 

1.  Compute  a%  =  r^r-fc-^af  =  q^Cq^x,  and  ak  =  a%/a%; 

2.  Update  (3k  =  (3k_x  +  a*qfc_i; 
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3.  Set  rk  =  r*_i  -  atCqA_i,  and  rk  =  ffc-i  -  akCTqk-i; 

4.  Compute  p%  =  rjfr*,  and  pk  =  pk/(xk; 

5.  Set  qk  =  rk  +  pkqk-i,  and  qfc  =  rfe  +  />*qjb-i; 

6.  If  rk  w  0  or  r*  «  0,  stop;  else  go  to  1. 

Like  the  CG,  the  operations  and  storage  requirement  per  step  in  the  BCG  are  con- 
stant. The  convergence  behavior  is  similar  to  the  CG  method  except  that  the  BCG 
algorithm  is  susceptible  to  breakdowns  and  numerical  instabilities.  To  be  more  spe- 
cific, division  by  0  (or  a  number  very  close  to  0)  may  occur  in  computing  ak  and 
pk.  These  two  different  breakdowns  may  be  avoided  by  using  the  look-ahead  Lanczos 
algorithm  or  the  quasi-minimal  residual(QMR)  algorithm  [19]. 

In  the  above  algorithm,  two  matrix- vector  multiplications  Cq^-i  and  Cq*_i  are 
required  in  each  iteration.  Since  the  matrix  C  is  dense,  the  direct  multiplication  will 
take  O(N)  operations,  which  is  not  acceptable  to  our  pursuit  of  an  O(N)  algorithm 
for  the  Poisson  equation.  Thus,  we  propose  to  carry  out  the  multiplication  in  a 
wavelet  basis  to  reduce  the  computational  cost  [1,  7].  With  appropriate  wavelet  basis 
[7],  the  decomposed  matrix  can  be  approximated  to  high  accuracy  by  a  sparse  matrix. 
After  the  approximation,  the  multiplication  of  this  sparse  matrix  and  the  decomposed 
vector  can  be  reduced  to  0(n)  or  0(n\og(n))  operations  [7].  Consequently,  the 
modified  BCG  method  is  computationally  efficient. 

The  wavelet  transform  of  a  matrix  D  and  vector  q  can  be  written  as  follows: 


D  =  $D$T,    q  =  $q,  (4.35) 
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where  #  is  the  wavelet  transform  matrix  with  each  row  corresponding  to  a  wavelet 
basis  vector.  If  we  use  the  orthogonal  wavelet  transform,  then  Dq  =  <P  (Dq). 
Therefore,  we  compute  matrix- vector  multiplication  in  the  wavelet  domain  and  then 
transform  back  to  the  original  domain  by  using  the  wavelet  reconstruction  (inverse 
wavelet  transform)  to  obtain  the  product  Dq. 

It  is  very  crucial  to  know  the  structure  of  the  capacitance  matrix  to  achieve  an 
accurate  and  efficient  approximation  of  the  matrix.  In  equation  3.4,  the  capacitance 
matrix  is  composed  of  three  matrices,  i.e.,  the  identity  matrix  I,  —Y^T=o  ei+iH1hr 
and  VTr/.  The  first  two  matrices  are  extremely  sparse,  but  the  third  is  a  dense 
matrix.  When  computing  the  matrix-vector  product  of  the  capacitance  matrix  and 
a  vector,  we  directly  compute  the  separate  products  for  the  first  two  sparse  matrices 
while  the  aforementioned  wavelet  approximation  scheme  is  used  only  for  the  matrix 
VTrj. 

4.4     Experimental  Results 

We  have  successfully  applied  our  generalized  capacitance  matrix  algorithm  to 
a  host  of  early  vision  problems  which  use  the  membrane  and/or  thin-plate  spline 
to  constrain  the  solution  [85,  45].  These  early  vision  problems  include  the  surface 
reconstruction,  shape  from  shading,  optical  flow  computation,  lightness  problem, 
etc.  [85,  45].  These  problems  when  formulated  as  variational  principles  and  dis- 
cretized,  lead  to  solving  one  or  more  large  sparse  linear  systems.  In  this  section, 
we  present  the  results  of  applying  our  algorithm  to  the  surface  reconstruction  and 
shape  from  shading  problems. 
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4.4.1     Surface  Reconstruction 

We  implemented  our  generalized  capacitance  matrix  algorithm  on  the  sparse 
data  surface  reconstruction  with  membrane  and  thin-plate  splines.  We  synthesized  a 
sparse  range/elevation  data  set  on  a  64  x  64  grid.  The  input  data  set  is  very  sparse 
and  contains  15  data  points  randomly  scattered  in  the  plane  as  shown  in  figure  4.3(a). 
We  apply  our  algorithm  to  this  data  to  obtain  a  surface  approximation. 

In  the  first  experiment,  we  reconstruct  the  surface  to  approximate  this  sparse 
data  set  by  using  membrane  and  thin-plate  smoothness  constraints.  Figure  4.3(b) 
shows  the  reconstructed  surface  using  membrane  smoothness  after  10  ADI  iterations. 
We  observe  that  only  10  ADI  iterations  are  needed  to  attain  an  error  tolerance  of 
10-5.  The  reconstructed  surface  using  thin-plate  smoothness  after  16  ADI  iterations 
is  shown  in  figure  4.3(c).  We  can  see  the  reconstructed  surface  using  thin-plate 
smoothness  is  much  smoother  than  that  using  membrane  smoothness. 

The  second  experiment  is  the  surface  reconstruction  on  the  same  data  set  with 
prespecified  discontinuities  along  a  horizontal  line,  between  the  coordinates  (1,32) 
and  (30,32).  as  shown  in  figure  4.4(a).  We  depict  the  computed  surface  with  mem- 
brane smoothness  obtained  using  our  generalized  capacitance  matrix  algorithm  with 
1  and  10  ADI  iterations  in  figure  4.4(b)  k  (c)  respectively.  Figure  4.4(d)  shows  the 
result  of  the  same  experiment  with  thin-plate  smoothness  after  16  ADI  iterations 
in  our  generalized  capacitance  matrix  algorithm.  A  similar  experiment  with  depth 
discontinuities  along  the  diagonal  line  segment  extending  from  the  coordinate  (1,63) 
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M^ 


(a) 


(b) 


(c) 


Figure  4.3.  Surface  reconstruction  example;  (a)  original  sparse  data,  (b)  recon- 
structed surface  using  membrane  smoothness,  (c)  reconstructed  surface  using  thin- 
plate  smoothness,  obtained  using  the  generalized  capacitance  matrix  algorithm. 
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to  (63, 1)  was  performed.  The  data  along  with  the  recovered  surface  using  membrane 
smoothness  are  shown  in  figure  4.5(a)  &  (b)  respectively. 


(a) 


(b) 


(c) 


(d) 


Figure  4.4.  Surface  reconstruction  with  prespecified  discontinuities  along  a  horizon- 
tal line;  (a)  original  sparse  data  set  with  depth  discontinuities  along  the  indicated 
horizontal  line;  reconstructed  surfaces  using  a  membrane  spline  after  (b)  1  iteration 
and  (c)  10  iterations  of  ADI  method;  (d)  reconstructed  surface  using  a  thin-plate 
spline  after  16  iterations  of  ADI  method. 


4.4.2     Shape  from  Shading 

In  this  problem,  we  tested  our  algorithm  on  a  synthetically  generated  image  of 
a  Lambertian  sphere  illuminated  by  a  distant  point  source  and  viewed  from  the  same 
location  as  the  point  source.  Figure  4.6(a)  depicts  the  64  x  64  original  shaded  image 
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(a) 


(b) 


Figure  4.5.  Surface  reconstruction  with  prespecified  discontinuities  along  a  diagonal 
line;  (a)  the  original  sparse  data  set  with  depth  discontinuities  along  the  indicated 
diagonal  line,  (b)  reconstructed  surface  using  a  membrane  spline  after  10  iterations 
of  ADI  method. 

with  an  8  bit  resolution  in  gray  per  pixel.  The  sphere  is  assumed  to  have  Lambertian 

reflectance  properties  with  a  constant  albedo.  Both  depth  and  orientation  data  are 

specified  on  an  occluding  contour  depicted  in  figure  4.6(b).    Note  that  we  have  to 

solve  three  systems  of  equations  namely  Kxi  =  bi,  KX2  =  b2  and  K^  =  b3  (see 

section  2.3).  The  first  two  systems  are  for  recovering  the  surface  normals  and  the  last 

equation  for  recovering  the  depth  from  the  surface  orientation  map.  We  start  the  ADI 

iteration  for  the  linear  systems  with  the  initial  values  of  p  —  q  =  z  =  0.  The  recovered 

surface  orientation  is  shown  in  figure  4.6(c)  and  the  final  surface  shape  obtained  using 

the  orientation  information  is  depicted  in  figure  4.6(d).  For  each  computed  (p,q)  on 

the  RHS  of  equation  2.18,  the  ADI  error  tolerance  was  set  to  10-5.  The  same  tolerance 

was  used  for  the  ADI  iterations  in  the  depth  from  orientation  problem. 
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Although,  we  have  not  compared  the  computational  time  of  existing  solution 
methods  for  the  discretized  Poisson  equation  with  the  computational  performance 
of  our  algorithm,  we  believe  that  our  method  will  outperform  most  other  existing 
methods.  Future  research  efforts  will  focus  on  empirically  demonstrating  this  claim. 
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Figure  4.6.  Shape  from  shading  example,  (a)  Shaded  sphere  image  (b)  irregular 
occluding  boundary  imbedded  in  a  rectangle  (c)  recovered  surface  orientation,  (d) 
reconstructed  surface  shape. 


CHAPTER  5 
ADAPTIVE  PRECONDITIONING  TECHNIQUE 

5.1     Introduction 

The  generalized  capacitance  matrix  algorithm  presented  in  chapter  4  is  very 
efficient  to  solve  the  linear  systems  arising  from  the  early  vision  problems.  However, 
this  technique  can  prove  to  be  inefficient  for  dense  data  problems  with  nonuniform 
weighting  on  the  data  constraints,  such  as  the  optical  flow  problem,  since  the  size 
of  the  associated  (  dense  )  capacitance  matrix  is  too  large.  Although  this  problem 
may  be  circumvented  by  incorporating  the  capacitance  matrix  technique  as  part  of 
an  iterative  scheme  [72],  however  it  was  pointed  out  in  [12]  that  this  semi-direct 
numerical  scheme  only  converges  when  the  regularization  parameter  A  is  very  large. 
In  this  chapter,  we  present  a  novel  adaptive  preconditioning  technique  which  when 
used  in  conjunction  with  a  conjugate  gradient  algorithm  can  solve  the  linear  systems 
very  efficient.  This  adaptive  preconditioning  technique  can  be  universally  applied  to 
solve  any  early  vision  problem. 

Numerical  iterative  methods  have  been  popular  for  solving  the  linear  systems  in 
early  vision  problems.  Several  multiresolution  preconditioning  techniques  [75,  61,  92] 
have  been  proposed  to  accelerate  the  convergence  speed  of  the  iterative  methods  in 
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computer  vision  literature.  In  this  chapter,  we  will  introduce  a  physically-based  adap- 
tive preconditioning  technique  which  will  outperform  -  in  computational  efficiency 
-  all  previously  proposed  preconditioning  methods  in  vision  literature  namely,  the 
hierarchical  basis  preconditioned  conjugate  gradient  of  Szeliski  [75]  and  the  change 
of  basis  methods  of  Yaou  et  al.  [92]  and  Pentland  [61].  In  addition,  it  does  not  have 
any  restriction  on  the  choice  of  the  regularization  parameter  A  for  its  convergence. 

In  [75,  76],  Szeliski  presented  a  preconditioning  technique  based  on  the  use  of 
hierarchical  basis  functions.  This  technique  is  based  on  the  multi-level  splitting  of  the 
finite  element  spaces  presented  in  [93].  The  hierarchical  basis  functions  naturally  lead 
to  a  pyramidal  representation  of  the  unknown  function  being  solved  for.  Empirical 
convergence  results  were  shown  for  preconditioned  conjugate  gradient  in  comparison 
with  the  standard  conjugate  gradient  method.  A  key  piece  of  information  that  was 
overlooked  in  Szeliski 's  work  [75,  76]  was  that,  in  designing  a  preconditioner  for  the 
conjugate  gradient  algorithm,  no  information  about  the  problem  being  solved  was 
used  i.e.,  once  a  basis  set  was  chosen,  the  same  preconditioner  was  used  regardless 
of  the  imposed  smoothness  or  data  constraints.  Thus,  the  true  power  of  a  precondi- 
tioning transform  was  not  exploited  to  its  fullest. 

More  recently,  Pentland  [61]  presented  a  technique  for  surface  interpolation  us- 
ing a  change  of  basis  to  orthonormal  wavelets  as  a  preconditioning  transform.  He 
uses  the  results  of  Beylkin  et  al.,  [7]  which  states  that  the  application  of  (N  x  N) 
matrices  -  corresponding  to  any  pseudo-differential  and  Calderon-Zygmund  opera- 
tors -  to  arbitrary  vectors  requires  either  O(N)  or  0(N  log  N)  operations  based  on 
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whether  nonstandard  or  standard  wavelet  transform  is  used.  The  result  of  Beylkin 
et  al.,  primarily  shows  a  reduction  in  the  bandwidth  of  the  matrices  corresponding 
to  pseudo-differential  operators  in  a  wavelet  basis  and  gives  a  technique  whereby  an 
0(N)  coefficients  can  be  used  to  approximate  the  (N  x  N)  operator,  for  a  given 
tolerance.  This  compression  takes  0(N)  time  if  the  structure  of  the  singularities  of 
the  matrix  are  known  a  priori. 

There  are  three  issues  to  be  noted  with  regards  to  Pentland's  reported  results. 
Firstly,  after  a  change  of  basis  to  orthonormal  wavelets,  the  off-diagonal  terms  were 
discarded  in  [61],  under  the  claim  that  the  transformed  stiffness  matrix  was  diagonally 
dominant.  This  diagonal  dominance  behavior  was  depicted  via  the  profiling  of  a  row 
of  the  transformed  stiffness  matrix  at  a  high  resolution.  Upon  close  examination, 
we  found  that  the  diagonal  dominance  property  does  not  hold  in  general  at  lower 
resolutions  (see  figure  5.10)  and  thus,  discarding  the  off-diagonal  terms  is  unjustified 
and  leads  to  a  solution  which  may  be  far  from  the  true  solution.  Secondly,  Pentland's 
preconditioner  requires  the  computation  of  the  diagonal  entries  in  a  wavelet  trans- 
formed stiffness  matrix.  Although  the  stiffness  matrix  is  banded,  the  computation  of 
these  diagonal  entries  takes  0(N  log  N)  operations,  which  is  too  expensive  for  large 
N.  Thirdly,  it  is  not  known  if  the  matrix  (K  +  S)  in  his  linear  system  (K  +  S)U  =  D 
being  solved  for  the  interpolation  problem  would  satisfy  the  conditions  set  forth  in 
Beylkin  et  al.,  [7]  in  order  for  their  bandwidth  reduction  scheme  to  be  applicable  in 
this  case.  Further,  we  found  that  his  algorithm  fails  to  converge  in  reasonable  number 
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of  iterations  to  the  correct  solution  for  very  sparse  data  that  is  used  in  the  examples 
of  this  thesis. 

Yaou  and  Chang  [92]  report  a  fast  surface  interpolation  algorithm  using  the 
multiresolution  wavelet  transform.  This  work  is  almost  exactly  similar  to  that  of 
Szeliski  with  the  exception  that  Yaou  and  Chang  use  a  wavelet  basis  instead  of  the 
hierarchical  basis  and  a  Jacobi  iteration  [26]  instead  of  the  conjugate  gradients  used  in 
Szeliski  [75].  As  in  Szeliski,  the  full  potential  of  a  preconditioning  technique  was  not 
exploited  in  Yaou  and  Chang  i.e.,  the  design  of  the  preconditioner  did  not  make  use  of 
any  information  about  the  problem  being  solved.  The  stiffness  matrix  in  the  surface 
interpolation  problem  was  first  transformed  to  a  bi-orthonormal  wavelet  basis  and 
then  the  transformed  linear  system  was  solved  using  a  Jacobi  iteration.  Note  that, 
in  a  preconditioning  technique  applied  to  the  the  positive  definite  stiffness  matrix  K, 
one  tries  to  approximate  K  as  well  as  possible  with  another  positive  definite  matrix  P 
known  as  the  preconditioner,  such  that  P_1K  is  a  good  approximation  to  the  Identity 
matrix.  Such  an  approximation  P  was  never  constructed  in  Yaou  and  Chang  [92]. 
Further,  upon  implementing  their  algorithm,  it  was  found  that  their  published  results 
about  rate  of  convergence  were  only  valid  for  data  from  periodic  function  since  their 
implementation  took  advantage  of  periodic  wavelet  transforms. 

In  this  chapter,  we  will  present  a  very  effective  way  to  construct  a  physically- 
based  adaptive  preconditioner  in  a  wavelet  basis  for  several  early  vision  problems 
namely,  the  surface  reconstruction  (SR),  shape  from  shading  (SFS)  and  computation 
of  optical  flow  (OF).  This  physically-based  preconditioner  is  adapted  to  the  membrane 
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spline  or  the  thin  plate  spline  or  a  convex  combination  of  the  two.  Experimental 
results  depicting  the  performance  of  our  algorithm  in  comparison  to  existing  methods 
discussed  above  are  presented  for  synthetic  and  real  data. 

The  rest  of  this  chapter  is  organized  as  follows.  In  section  5.2,  an  adaptive 
preconditioner  constructed  in  a  wavelet  basis  is  presented,  followed  by  the  adaptive 
preconditioned  conjugate  gradient  algorithm  in  section  5.3.  Algorithm  implementa- 
tion on  synthetic  and  real  data  are  presented  in  section  5.4.  In  section  5.5,  we  end 
with  a  discussion. 

5.2     Adaptive  Preconditioning  in  a  Wavelet  Basis 

Iterative  methods  are  preferable  to  direct  methods  for  solving  large  and  sparse 
linear  systems  in  early  vision  problems,  since  direct  methods  require  the  formation 
of  intermediate  matrices  which  are  not  necessarily  sparse.  Although  the  sparsity 
can  be  preserved  when  using  direct  methods  on  ID  problems  with  the  smoothness 
constraint,  it  will  be  destroyed  in  the  case  of  2D  problems.  This  makes  the  number 
of  operations  and  the  amount  of  storage  too  large  for  2D  and  higher  dimensional 
problems. 

The  efficiency  of  applying  an  iterative  method  to  solve  a  linear  system  depends 
on  its  rate  of  convergence,  which  is  a  function  of  the  condition  number  of  the  coef- 
ficient matrix  in  a  linear  system.  Unfortunately,  many  early  vision  problems  after 
regularization  lead  to  large  ill-conditioned  linear  systems.  Therefore,  most  itera- 
tive methods  converge  very  slowly  to  the  true  solution  and  are  thus  computationally 
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impractical.  One  way  to  speed  up  the  convergence  of  an  iterative  method  is  via  pre- 
conditioning. In  this  section,  we  present  a  new  adaptive  preconditioning  technique 
wherein  the  preconditioner  is  constructed  in  a  wavelet  basis.  Since  the  proposed 
preconditioner  is  designed  to  adapt  to  the  spectral  characteristics  of  the  smoothness 
constraint  in  the  regularization  solution  (in  the  frequency  domain)  of  different  early 
vision  problems,  it  is  bound  to  perform  better  than  the  existing  "fixed"  precondition- 
ed [75,  42,  92]. 

5.2.1     Preconditioning  Technique 

For  a  linear  system  Kx  =  b  with  positive-definite  matrix  K,  we  assume  there 
exists  a  positive-definite  matrix  P,  the  preconditioner  for  the  matrix  K,  such  that  the 
condition  number  of  P_1K  is  greatly  reduced.  Since  P  is  a  positive-definite  matrix, 
it  can  be  shown  that  there  exists  a  positive-definite  matrix  C  such  that  C2  =  P  [26]. 
Now,  we  can  define  the  vectors  x  and  b  via  the  transformation  matrix  C  as  follows 

x  =  Cx,     b  =  C_1b.  (5.1) 

Thus,  the  linear  system  Kx  =  b  can  be  solved  by  solving  the  transformed  linear 
system 

Kx  =  b,  (5.2) 

where  K  =  C_1KC-1.  Note  that  the  matrix  K  in  the  transformed  linear  system 
is  similar  to  the  matrix  P-1K,  i.e.,  the  condition  numbers  of  K  and  P_1K  are 
equivalent.   The  condition  number  of  the  transformed  matrix  is  much  smaller  than 
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that  of  the  original  matrix  according  to  the  aforementioned  criterion  of  choosing  a 
preconditioner  P.  Consequently,  the  convergence  of  the  iterative  methods  can  be 
accelerated  by  a  transformation  of  the  original  linear  system  to  a  better  conditioned 
linear  system. 

Although  the  conditioning  of  a  linear  system  can  be  improved  after  an  appro- 
priate transformation,  it  is  not  practical  to  form  the  matrix  K  to  solve  equation 
5.2  since,  the  transformation  of  a  matrix  requires  a  large  number  of  operations  and 
the  sparsity  structure  of  the  matrix  K  could  be  destroyed  after  the  transformation. 
Fortunately,  it  is  not  necessary  to  compute  this  transformation  of  the  matrix  K  to 
achieve  the  preconditioning  effect  in  a  conjugate  gradient  algorithm.  Instead,  the 
same  preconditioning  can  be  achieved  via  solving  an  auxiliary  linear  system  Pz  =  r, 
where  r  is  the  residual  vector,  in  each  conjugate  gradient  iteration.  (The  precondi- 
tioned conjugate  gradient  algorithm  is  given  in  section  5.3.)  Therefore,  the  second 
consideration  for  a  good  preconditioner  is  to  choose  the  matrix  P  such  that  the  linear 
system  Pz  =  r  can  be  easily  solved. 

The  above  two  criteria  for  designing  a  good  preconditioner  in  conjugate  gradient 
are  equally  important.  Reducing  the  condition  number  can  accelerate  the  convergence 
of  the  iterative  methods  employed.  Choosing  the  preconditioner  P  such  that  Pz  =  r 
is  easily  solvable  is  very  crucial  for  efficiently  updating  the  solution  in  each  iteration 
of  the  preconditioned  conjugate  gradient  (PCG)  algorithm.  Our  preconditioner  is  de- 
signed in  a  wavelet  basis  by  approximating  the  spectral  characteristics  of  the  matrix 
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K.  This  approximation  is  achieved  via  the  frequency  domain  analysis  of  the  regu- 
larization  operators  and  using  the  frequency  band  partitioning  behavior  of  a  wavelet 
basis.  Therefore,  our  adaptive  preconditioner  is  akin  to  a  spectrally  equivalent  op- 
erator for  the  original  matrix  K.  Furthermore,  the  solution  of  a  linear  system  using 
our  adaptive  preconditioner  can  be  obtained  by  taking  the  wavelet  transform,  scaling 
the  coefficients,  and  taking  the  inverse  wavelet  transform.  By  using  a  very  compactly 
supported  wavelet,  the  solution  update  can  be  achieved  very  efficiently. 

5.2.2     Regularization  Filter 

The  filtering  interpretation  of  regularization  operators  has  been  discussed  in 
vision  literature  [80,  83].  The  inverse  of  the  control-continuity  stabilizer  (Laplace  + 
Bi-harmonic  operators)  may  be  interpreted  as  a  low-pass  filter  hence,  the  solution  to 
an  early  vision  problem  in  a  regularization  framework  can  be  obtained  by  applying 
this  low-pass  filter  to  the  data.  We  will  use  this  filtering  behavior  of  the  regularization 
operator  to  develop  a  preconditioner  which  approximates  this  filter  in  a  wavelet  basis. 

In  section  2.3,  we  have  seen  the  variational  formulations  of  various  early  vision 
problems  that  lead  to  minimizing  an  energy  functional.  This  minimization  can  be 
achieved  by  solving  the  associated  Euler-Lagrange  equation  which  for  the  surface 
reconstruction  problem  in  the  continuous  data  case  (with  a  controlled-continuity 
stabilizer)  can  be  written  as 

p(x){(l  -  r(x))A2u(x)  -  t(x)Au(x)}  +  c(x)u(x)  =  c(x)d(x),  (5.3) 
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where  A  is  the  Laplace  operator,  A2  is  the  bi-harmonic  operator,  c(x)  is  the  weight- 
ing function  associated  with  the  data  d(x),  the  functions  p(x)  and  r(x)  control  the 
rigidity  and  tension  of  the  solution  surface  respectively.  Now  let's  consider  the  case 
when  p(x)  and  t(x)  are  constant  functions,  i.e.  p(x)  =  p  and  r(x)  =  r.  In  addition, 
let's  also  assume  the  function  c(x)  to  be  a  constant  function,  which  means  the  Gaus- 
sian noise  associated  with  each  data  measurement  has  the  same  variance.  Taking  the 
Fourier  transform  of  equation  5.3  we  get 

VM  =  ( r 5 I  V(u).  (5.4) 

Where  w  =  (uu  •••  ,u>d)  is  the  d-dimensional  frequency  domain  variable  while,  V(a;) 
and  X>(w)  are  the  Fourier  transforms  of  v(x)  and  d(x)  respectively.  Equation  5.4  can 
be  given  a  filtering  interpretation  namely,  V(w)  is  obtained  by  low-pass  filtering  the 
data  V(uj).  This  low-pass  filter  is  characterized  by  the  first  term  on  the  right  hand 
side  in  equation  5.4  called  the  transfer  function.  Our  preconditioner  is  constructed 
in  a  wavelet  basis  to  approximate  this  transfer  function. 

Equation  5.4  expresses  a  linear  shift-invariant  low-pass  filtering  of  the  data  in 
frequency  domain,  derived  from  a  regularization  framework.  This  filtering  interpre- 
tation can  be  generalized  to  the  regularly  sampled  discrete  data  case  in  a  straight- 
forward manner.  However,  this  linear  shift-invariant  filter  will  become  a  nonlinear 
and  spatially  varying  filter  in  the  general  cases  when  the  functions  p(x),  r(x)  and 
c(x)  are  not  constant.  These  cases  include  discontinuity-preserving  regularization 
and  sparse-data  interpolation.   Although  the  corresponding  nonlinear  and  spatially 
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varying  filter  is  difficult  to  characterize  fully  in  the  frequency  domain,  it  is  in  general 
a  low-pass  filter  and  we  can  approximate  its  spectral  characteristics  in  the  frequency 
domain.  In  the  next  section  we  show  how  to  achieve  this  approximation  in  a  wavelet 
basis. 

5.2.3     Wavelet  Basis 

Wavelet  transform  has  been  used  widely  in  the  signal  and  image  processing 
communities  since  it  provides  a  powerful  tool  for  multiresolution  analysis.  The  con- 
struction of  a  wavelet  basis  is  achieved  by  defining  a  scaling  function  <f>(x),  such  that 
the  integer  translates  of  the  dilated  scaling  function  \p2r* ~<£(2~J x)  form  an  orthonor- 
mal  basis  of  L2(5R),  where  j  €  Z.  This  orthonormal  basis  (=  {\/2-]<f>(2~3x  -  k)}kez) 
forms  a  subspace  Vj  C  L2(3ft),  and  the  subspaces  Vj  satisfy  the  containment  hierar- 
chy 

v.cv^vjez.  (5.5) 

Now  we  can  introduce  the  difference  space  Wj  between  the  two  consecutive  sub- 
spaces  Vj  and  Vj_i  such  that  Vj  ©  Wj  =  Vj_x ,  where  ©  denotes  the  direct  sum 
operator.  The  key  idea  of  the  wavelet  basis  is  the  difference  space  Wj  is  spanned  by 
an  orthonormal  basis  which  are  obtained  via  integer  translations  of  the  associated 
wavelet  function  tj>{x),  i.e.  Wj  =  {y/2~^ip(2~3x  —  k)}kez-  Therefore,  a  multiresolu- 
tion orthogonal  decomposition  of  the  space  V0  can  be  achieved  via 

Vj  ©  Wj  ©  Wj.j  ©  •  •  •  ©  W,  =  V0,  (5.6) 
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where  J  is  a  positive  integer.  From  this  multiresolution  orthogonal  space  decompo- 
sition, we  can  construct  a  multiresolution  orthonormal  wavelet  basis,  since  each  of 
the  subspaces  on  the  left  hand  side  of  equation  5.6  are  orthogonal  to  each  other  and 
there  exists  an  orthonormal  basis  for  each  subspace  as  discussed  above. 

The  wavelet  decomposition  and  reconstruction  can  be  efficiently  achieved  via  a 
QMF  (Quadrature  Mirror  Filters)  implementation  as  shown  in  figure  5.1  (see  [50]). 
In  this  figure,  Ajf  is  the  discrete  approximation  of  the  function  f  in  the  subspace 
Vj,  i.e.  Ajf  contains  the  values  of  the  inner  products  of  the  function  f  and  the 
orthonormal  basis  functions  in  Vj,  and  Djf  encodes  the  projection  of  the  function  f 
on  the  subspace  Wj.  The  filters  H  ( in  decomposition)  and  H  ( in  reconstruction)  are 
low-pass  filters,  while  the  filters  G  (  in  decomposition)  and  G  (  in  reconstruction) 
are  high-pass  filters.  They  are  usually  designed  to  be  the  FIR  filters,  which  can 
be  implemented  via  convolution  operations.  With  an  appropriate  choice  of  a  wavelet 
basis,  we  can  find  the  associated  FIR  filters  with  very  short  length/span  and  the  QMF 
filter  implementation  for  the  wavelet  transform  can  be  accomplished  very  efficiently. 

The  QMF  structure  decomposes  the  space  from  one  level  to  the  next  coarser  level 
with  each  stage  involving  a  low-pass  and  high-pass  filtering.  From  a  frequency  domain 
analysis,  we  can  see  that  a  wavelet  decomposition  divides  the  frequency  domain  into 
two  separate  bands  in  each  stage.  Therefore,  each  subspace  Wj  has  its  own  frequency 
band,  which  means  that  all  the  wavelet  basis  functions  in  any  particular  level  are  fully 
contained  in  the  frequency  band  associated  with  that  level.  This  behavior  of  dividing 
the  frequency  domain  [68,  86]  exhibited  by  a  wavelet  decomposition  is  illustrated 
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Figure  5.1.  The  implementation  of  the  wavelet  transform  via  the  QMF  structure 
from  one  level  to  the  next,  (a)  decomposition  (b)  reconstruction,  adopted  from  Mal- 
lat[1989]. 

in  figure  5.2.  Notice  that  the  closer  the  QMF  filters  are  to  the  ideal  low-pass  and 
high-pass  filters,  the  better  it  is  for  achieving  a  good  frequency-domain  division,  i.e., 
the  overlap  between  the  frequency  bands  associated  with  each  level  is  minimized  with 
close  approximations  of  the  ideal  low  and  high-pass  filters. 


«b         21b  4^  8fb 

Figure  5.2.  Division  of  the  frequency  domain  in  a  wavelet  basis. 
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In  [50],  Mallat  constructed  the  2D  wavelet  basis  by  using  the  tensor  products 
of  the  ID  wavelet  and  scaling  functions.   The  QMF  implementation  of  the  wavelet 
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transform  was  also  extended  to  the  two-dimensional  case  in  [50].  The  division  of 
frequency  domain  behavior  for  the  2D  wavelet  basis  is  simply  an  extension  of  the  ID 
case. 

After  introducing  the  division  of  frequency  domain  behavior  for  a  wavelet  basis, 
we  are  ready  to  construct  our  preconditioner  in  a  wavelet  basis  and  adapt  its  spectrum 
to  that  of  the  regularization  filter  discussed  in  section  5.2.2. 

5.2.4     Adaptive  Preconditioning  via  the  Modulation  in  a  Wavelet  Basis 

In  section  5.2.1,  we  discussed  two  criteria  for  selecting  a  good  preconditioner. 
In  this  section,  we  construct  an  adaptive  preconditioner  for  early  vision  problems  by 
following  these  two  criteria. 

Using  techniques  of  frequency  domain  analysis,  a  low-pass  filtering  interpretation 
can  be  given  to  the  regularization  formulation  of  early  vision  problems  as  was  shown  in 
section  5.2.2.  The  spectral  characteristics  of  the  regularization  filter  in  the  frequency 
domain  can  be  approximated  using  the  frequency  domain  subdivision  behavior  of 
the  wavelet  transform  which  can  be  implemented  very  efficiently  using  a  quadrature 
mirror  based  implementation. 

To  apply  a  preconditioner  P  to  a  matrix  K,  the  first  criterion  is  to  drastically 
reduce  the  condition  number  of  P_1K;  the  second  criterion  for  a  good  preconditioner 
is  that  the  auxiliary  linear  system  Pz  =  r  should  be  efficiently  solvable.  An  obvious 
choice  to  reduce  the  condition  number  of  P_1K  is  to  choose  P  =  K,  which  reduces 
the  condition  number  to  1.  Although  only  one  iteration  suffices  to  solve  the  problem, 
the  complexity  of  solving  the  auxiliary  linear  system  Pz  =  r  is  the  same  as  that  of 
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solving  the  original  linear  system.  Therefore,  nothing  is  gained  by  using  P  =  K.  A 
reasonable  choice  for  a  good  preconditioner  is  to  find  a  matrix  P  that  is  spectrally 
close  to  K;  in  addition,  the  solution  to  the  auxiliary  linear  system  Pz  =  r  should  be 
easily  solvable.  Our  preconditioner  is  constructed  based  on  this  very  idea. 

The  matrix  K  is  the  stiffness  matrix  obtained  from  a  discretization  of  the  regu- 
larization  formulation.  Therefore,  its  inverse  has  the  same  spectral  characteristics  as 
that  of  the  regularization  filter  given  in  equation  5.4.  Now  we  approximate  the  spec- 
tral characteristics  of  the  regularization  filter  in  a  wavelet  basis  by  using  its  frequency 
domain  partitioning  property.  This  approximation  is  accomplished  by  modulating 
each  wavelet  basis  function  by  the  spectral  value  of  the  regularization  filter  taken  at 
the  center  of  the  corresponding  frequency  domain  subregion.  To  be  more  specific, 
when  Mallat's  tensor  product  strategy  [50]is  used  for  the  2D  wavelet  transform,  each 
level  (resolution)  contains  three  sets  of  wavelet  basis  functions,  i.e.  LH  (<f>  ®  ip),  HL 
(*l>®(f>),  and  HH  (0®^>),  and  each  set  corresponds  to  its  own  subregion  in  the  2D  fre- 
quency domain.  The  modulation  for  the  wavelet  basis  functions  in  each  set  is  based 
on  the  spectral  function  of  the  regularization  filter  in  its  corresponding  subregion. 
For  ease  of  implementation,  we  can  just  take  the  spectral  function  value  at  the  center 
of  the  subregion  to  be  the  modulation  factor. 

The  inverse  of  the  stiffness  matrix  is  basically  a  low-pass  filter,  whose  spectral 
function  changes  rapidly  in  the  lower  frequency  region  and  varies  smoothly  in  the 
higher  frequency  region.  An  approximation  to  the  spectral  function  of  the  low-pass 
filter  achieved  in  a  wavelet  basis  is  very  fine  in  the  lower  frequency  regions  and  coarse 
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in  the  higher  frequency  regions.  This  is  because  the  division  of  the  frequency  domain 
becomes  finer  in  the  lower-frequency  band  with  increasing  levels  (see  figure  5.2). 
Therefore,  we  can  achieve  a  better  approximation  when  more  levels  are  used.  In  our 
preconditioner,  the  number  of  levels  used  is  determined  by  the  support  of  the  wavelet 
basis  used  and  the  problem  size.  Therefore,  for  a  given  problem  size  and  wavelet 
basis,  the  number  of  levels  is  fixed. 

Our  preconditioner  P  can  be  written  as 

P  =  WMWT,  (5.7) 

where  the  matrix  W  is  a  2D  orthonormal  wavelet  transformation  matrix  and  the 
matrix  M  is  the  diagonal  modulation  matrix,  whose  diagonal  values  are  obtained 
from  the  spectral  function  of  the  regularization  filter.  From  equation  5.7,  we  can 
see  that  the  solution  to  the  auxiliary  linear  system  Pz  =  r  in  the  preconditioned 
conjugate  gradient  can  be  computed  very  efficiently  when  the  wavelet  basis  are  chosen 
appropriately,  i.e.  the  QMF  filters  are  of  short  length/span.  As  long  as  the  QMF 
filters  are  of  finite  length,  the  solution  can  be  obtained  in  O(N)  operations.  The 
solution  to  this  auxiliary  linear  system  is  obtained  in  the  three  steps  namely,  (1)  take 
the  wavelet  transform  of  the  vector  r,  (2)  modulate  the  transformed  vector  using  the 
matrix  M,  and  (3)  take  the  inverse  wavelet  transform. 


96 

5.3     Adaptive  Preconditioned  Conjugate  Gradient  Algorithm 

As  discussed  in  section  2.3,  the  discretization  of  the  variational  formulation  leads 
to  solving  the  system  of  equations  Kx  =  b.  The  stiffness  matrix  K  is  symmetric 
positive  definite  and  therefore,  the  conjugate  gradient  algorithm  can  be  used  to  solve 
the  problem  guaranteeing  a  unique  solution.  When  a  preconditioner  P  is  used  to 
accelerate  the  convergence,  we  have  what  is  known  as  the  preconditioned  conjugate 
gradient  algorithm  (see  [26])  given  as  follows. 

0.  Choose  an  initial  x0,  and  compute  r0  =  b  —  Kx0;  set  k  =  0. 

1.  Solve  Pzjt  =  rk;  k  =  k  +  1. 

2.  If  k  =  1,  set  pi  =  z0;  else  compute  /3f  =  rf_2zfc_2  and  /3k  =  a£_i/#P; 

update  pfc  =  Zk-i  +  PkPk-i- 

3.  Compute  af  =  ifljSfc-i,  of  =  p£Kpfc,  and  ak  =  aj^/af ; 

4.  Update  rk  =  rfc_i  -  akpk. 

5.  Update  Xjt  =  Xjt-i  +  akpk. 

6.  If  rk  ~  0,  stop;  else  go  to  step  1. 

The  only  difference  between  the  preconditioned  conjugate  gradient  algorithm 
and  the  standard  conjugate  gradient  algorithm  is  the  preconditioning  process  in  step 
1.  Since  the  preconditioning  involves  solving  the  auxiliary  linear  system  Pz^  =  r^  in 
each  iteration,  it  is  absolutely  necessary  to  have  a  fast  solution  to  this  preconditioning 
equation  when  designing  a  good  preconditioner. 
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The  strategy  for  constructing  our  adaptive  preconditioner  for  the  early  vision 
problems  with  regularization  formulation  was  discussed  in  section  5.2.  Our  precon- 
ditioner is  based  on  the  idea  of  modulation  in  a  wavelet  basis  to  approximate  the 
spectral  characteristics  of  the  matrix  K.  Accordingly,  our  adaptive  wavelet  precondi- 
tioner P  took  the  form  WMWr  with  the  diagonal  modulation  matrix  M  encoding 
the  spectral  approximation  in  the  wavelet  basis.  Now,  the  primary  problem  in  this 
preconditioner  construction  is  how  to  construct  the  modulation  matrix  M  for  general 
problems.  In  the  following  paragraphs,  we  show  how  to  achieve  the  construction  of 
an  adaptive  preconditioner  for  the  surface  reconstruction,  shape  from  shading,  and 
optical  flow  computation  problems. 

5.3.1     Surface  Reconstruction 

For  the  surface  reconstruction,  the  stiffness  matrix  K(=  Kj  +  K,)  contains  the 
matrix  K<*  obtained  from  the  data  constraint  and  the  matrix  Ks  corresponding  to  the 
smoothness  constraint.  When  the  controlled-continuity  stabilizer  is  used,  the  matrix 
K,  is  the  linear  combination  of  the  discrete  Laplacian  matrix  (  from  membrane)  and 
the  discrete  biharmonic  matrix  (  from  thin  plate).  Since  the  spectral  distributions  for 
both  matrices  are  known  a  priori,  the  spectral  distribution  for  the  matrix  Ks  can  be 
obtained  from  their  linear  combination.  Now,  what  remains  to  be  determined  is  the 
spectral  distribution  for  the  matrix  Kj.  For  the  dense  data  case  with  uniform  weights 
for  each  data  constraint,  the  matrix  Kj  is  a  scaled  identity  matrix,  whose  spectral 
distribution  is  a  constant  function.  This  case  corresponds  to  the  discrete  version 
of  the  linear  shift-invariant  low-pass  filtering  interpretation  for  the  regularization 
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problem.  For  other  cases  such  as  sparse  data  surface  reconstruction  or  dense  data 
with  nonuniform  weights,  we  somehow  need  to  estimate  the  spectral  distribution  of 
Kd  in  the  frequency  domain.  We  will  elaborate  further  on  this  spectral  estimation  in 
the  following  paragraphs. 

Since  the  spectral  approximation  of  the  matrix  K  is  achieved  in  the  wavelet 
basis,  we  can  approximate  the  spectral  distribution  of  the  matrix  Kj  via  the  wavelet 
transform.  One  way  is  to  do  the  wavelet  transform  of  the  matrix  Kj  and  then  compute 
the  energy  contained  in  each  diagonal  block  which  corresponds  to  a  particular  division 
of  the  frequency  band,  and  the  computed  energy  is  used  to  represent  the  spectral  value 
for  this  frequency  band.  To  use  this  spectral  estimation,  we  need  to  develop  a  fast 
computational  algorithm  since  the  wavelet  transform  of  an  N  x  N  matrix  requires 
0(N2)  operations.  Another  way  to  estimate  the  spectral  distribution  of  Kj  is  to 
take  the  wavelet  transform  of  its  diagonal  entries  only.  The  N  diagonal  entries  of  the 
N  x  N  diagonal  matrix  Kj  are  grouped  into  an  n  x  n  matrix.  Then  we  compute  the 
energy  from  the  wavelet  transform  of  this  matrix  and  associate  it  with  the  spectral 
value  of  each  frequency  band.  This  method  takes  advantage  of  the  diagonal  structure 
of  the  matrix  Kj  and  can  be  accomplished  in  0(N)  operations.  We  use  this  method 
to  estimate  the  spectrum  of  the  matrix  Kj  in  our  experiment. 

Using  the  above  discussed  spectral  approximation  to  the  matrices  Ks  and  Kj 
separately  in  a  wavelet  basis,  we  can  obtain  the  diagonal  approximation  matrices  Ms 
and  Mj,  respectively.  Then,  the  modulation  matrix  M  is  taken  to  be  the  sum  of 
Ms  and  Mj.  Since  the  matrix  Mj  is  obtained  from  the  spectral  approximation  and 


99 


the  matrix  Ma  has  very  small  diagonal  entries  at  the  very  coarse  resolution  levels, 
the  spectral  estimation  error  in  Md  at  these  levels  may  cause  significant  error  in  the 
spectral  approximation  to  the  matrix  K.  This  problem  may  exist  only  in  the  case  of 
sparse  data  surface  reconstruction,  since  the  spectral  estimation  for  Kd  in  the  dense 
data  case  is  quite  accurate. 

To  amend  the  problem  caused  by  the  spectral  estimation  error,  we  introduce  a 
weighting  function  <r(i),  where  i  is  the  number  of  iterations,  to  appropriately  weight 
the  importance  of  Ms  and  Md  during  the  CG  iterations,  i.e., 

M(i)  =  Ma  +  <r(i)Md.  (5.8) 

The  matrix  M(i)  is  the  modulation  matrix  in  the  i-th  iteration.  From  the  filtering 
correspondence  with  the  regularization  formulation,  we  can  see  that  the  overall  spec- 
tral characteristics  for  the  matrix  K_1  is  a  low-pass  filter.  Note  that  the  inverse  of 
K,  is  also  a  low-pass  filter,  while  the  inverse  of  Kd  is  not.  Therefore,  the  matrix 
Ms  (  or  Ks  )  is  more  dominant  than  the  matrix  Md  (  or  Kd  )  for  solving  the  linear 
system  Kx  =  b.  Based  on  this  observation,  we  set  the  weighting  function  cr(i)  to  be 
very  small  initially  and  then  gradually  increase  the  values  of  the  weighting  function. 
Using  this  schedule  for  the  weighting  function,  the  lower  frequency  components  of 
the  solution  are  recovered  at  the  beginning  of  the  preconditioned  conjugate  gradient 
iterations,  and  then  the  higher  frequency  components.  This  global-to-local  shape  re- 
covery phenomenon  can  be  predicted  by  the  choice  of  the  weighting  function  cr(i).  In 
fact,  the  weighting  function  can  be  adapted  during  the  iterations  instead  of  following 
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a  preset  schedule.  The  adaptation  concept  is  similar  to  the  adaptive  step  scheme  in 
the  scale  space  tracking  [89]. 

The  preconditioner  construction  for  the  shape-from-shading  and  the  optical  flow 
computation  problems  is  very  similar  to  the  one  outlined  here  for  the  surface  recon- 
struction problem.  We  only  state  their  differences  in  the  following  paragraphs. 

5.3.2     Shape  from  Shading 

The  variational  formulation  of  the  SFS  problem  leads  to  solving  a  set  of  nonlin- 
ear partial  differential  equations.  After  discretization,  we  need  to  solve  the  system 
Kx  =  b.  If  the  linear  approximation  of  the  reflectance  function  technique  [38]  is  used 
to  improve  the  convergence,  the  stiffness  matrix  K  €  ^Nx3N  has  the  following  form. 


K  = 


ftK. 

/xI®D 

fxD®I 

/zI<g>Dr 

AK 

,  +  (Rp  +  p)I 

RpRql 

^DT®I 

RpRql 

AK 

,  +  (Rq  +  n)l 

(5.9) 


where  Ks  G  %{NxN  is  the  Laplacian  matrix,  I  €  $lNxN  is  the  identity  matrix,  Rp  and 
Rq  are  the  first-order  partial  derivatives  of  the  reflectance  function  with  respect  to  p 
and  q,  respectively,  taken  at  the  reference  gradient  (p0,  q0),  and  the  matrix  D  G  3fjnXn 
(JV  =  n2)  has  nonzero  entries  only  along  the  main  diagonal  and  upper  diagonal  with 


Is  along  the  main  diagonal  and  —Is  along  the  upper  diagonal,  i.e., 


1     -1      0 


0      1-10 


D  = 


0 
0 


•    •••      1      -1 

.   ...    0     0 
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(5.10) 


Our  preconditioner  P  has  similar  block  structure  as  K  and  can  be  expressed  as 


P  = 


WOO 


0     W     0 


0      0     w 


Mi  0  0 
0  M2  0 
0       0      M3 


Wr  0  0 
0  WT  0 
0        0      WT 


(5.11) 


The  modulation  matrices  Mi,  M2  and  M3  are  all  diagonal  and  they  are  used  to  mod- 
ulate the  wavelet  basis  functions  corresponding  to  z,  p,  and  q  vectors,  respectively. 
The  modulation  matrices  are  constructed  to  approximate  the  spectral  distributions 
of  the  corresponding  diagonal  blocks  in  the  matrix  K,  i.e.  /xK4,  AKS  +  (Rp  +  //)I  and 
AKS  +  (Rq  +  fi)I.  The  construction  procedure  for  each  block  is  the  same  as  discussed 
in  the  surface  reconstruction  problem. 


102 


5.3.3     Optical  Flow  Computation 

Similar  to  the  SFS  problem,  the  stiffness  matrix  K  for  the  optical  flow  estimation 
has  the  following  2x2  block  structure. 


K  = 


K.  +  Exx        E 


*» 


Exy        Ks  +  E 


yy 


(5.12) 


where  Exx,  E™,  and  Eyy  are  all  diagonal  matrices  with  the  normalized  entries  -i  J_ a , 


Ex+Ey 


El 


%**%,  and  _/'\j.  The  construction  of  the  block  preconditioner  is  similar  to  that  in 


Ex+Ey 


the  SFS  problem,  therefore  it  is  omitted  here. 

5.4     Experimental  Results 

In  this  section,  we  present  implementation  results  of  applying  our  adaptive  pre- 
conditioner to  the  surface  reconstruction,  shape-from-shading,  and  optical  flow  com- 
putation problems.  Since  our  preconditioner  adapts  to  the  problem,  it  is  bound  to 
perform  better  than  the  existing  fixed  preconditioners  described  in  literature.  We 
will  justify  this  claim  through  the  experimental  results. 

Note  that  there  is  no  restriction  on  the  wavelet  basis  being  used  in  the  construc- 
tion of  our  preconditioner.  However,  the  efficiency  of  our  preconditioner  is  related 
to  the  localization  of  the  wavelet  being  used.  The  time  localization  makes  the  com- 
putation of  the  wavelet  transform  efficient;  the  localization  in  the  frequency  domain 
yields  a  more  accurate  spectral  approximation  and  therefore  faster  convergence  of  the 
preconditioned  conjugate  gradient  can  be  achieved.  In  our  experiment,  we  use  the 
cubic  B-spline  wavelet  given  in  [50]  with  the  QMF  filters  truncated  to  a  length  of  11. 
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This  truncation  is  used  to  reduce  the  operations  involved  in  QMF  implementation 
of  the  wavelet  transform  and  an  insignificant  error  is  introduced  in  the  frequency 
domain. 

In  constructing  our  preconditioner,  the  weighting  function  cr(i)  is  needed  to 
control  the  contributions  of  Ka  and  K<f  to  the  modulation  matrix.  We  suggest  that 
the  weighting  function  be  chosen  as  an  increasing  function.  We  used  an  increasing 
function  of  the  form  (l-e"c)d,  where  c  and  d  are  constants,  for  the  weighting  function. 
In  our  experiment,  c  was  usually  set  to  10  (for  problems  not  severely  ill-conditioned) 
or  100  (for  severely  ill-conditioned  problems),  and  accordingly  d  was  set  to  be  1  or  2. 

5.4.1     Surface  Reconstruction 

We  present  results  of  several  experiments  with  sparse  data,  and  compare  the  con- 
vergence rate  of  our  adaptive  preconditioned  conjugate  gradient  (APCG)  algorithm 
with  the  hierarchical  basis  conjugate  gradient  (HCG)  technique  [75],  the  conjugate 
gradient  (CG)  algorithm  [26],  and  the  bi-orthogonal  wavelet  basis  transfer  scheme 
with  Jaffard's  diagonal  preconditioning  [42].  In  [92],  the  biorthogonal  wavelet  basis 
transfer  is  used  in  the  preconditioning  process.  We  found  that  this  wavelet  basis 
transfer  must  be  combined  with  appropriate  scaling  to  improve  the  convergence, 
which  was  not  mentioned  in  [92].  In  [42],  Jaffard  proposed  the  2J,  where  j  is  the 
level,  diagonal  scaling  in  the  wavelet  basis  as  a  preconditioning  technique.  Jaffard's 
2J  diagonal  scaling  can  be  changed  to  2mj  scaling  where  m  is  a  positive  integer.  We 
combined  this  variant  of  Jaffard's  diagonal  scaling  with  the  bi-orthogonal  wavelet  ba- 
sis transfer  and  implemented  it  for  conducting  the  convergence  comparisons.  We  use 
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Figure  5.3.    The  sparse  data  sets  used  for  the  surface  reconstruction,  (a)  synthetic 
data  (b)  synthetic  symmetric  data  (c)  real  laser  range  data. 
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all  of  the  aforementioned  preconditioned  in  conjunction  with  a  conjugate  gradient 
algorithm  and  test  their  performance  on  one  synthesized  sparse  data  set  (see  fig- 
ure 5.3)  in  the  surface  reconstruction  experiments  and  one  real  range  data  set  from  a 
laser  scanned  ball.  Subsequent  experiments  depict  the  convergence  comparison  of  the 
aforementioned  preconditioners  applied  to  the  SFS,  and  OF  computation  problems. 
In  the  first  experiment,  we  recover  the  underlying  surface  from  the  sparse  data 
set  in  figure  5.3(a)  using  the  controlled-continuity  stabilizer  with  the  parameters  p 
and  r  being  1  and  0.1,  respectively.  The  size  of  the  problem  is  64  x  64.  Applying 
our  adaptive  preconditioner  to  this  problem,  we  can  see  that  the  iterations  converge 
to  the  true  solution  very  fast.  The  computed  solutions  after  1,  5,  10  and  20  steps 
of  conjugate  gradient,  HCG,  and  our  APCG  algorithms  are  illustrated  in  figure  5.4. 
With  only  five  iterations  of  our  APCG  algorithm,  the  global  shape  of  the  surface 
is  recovered.  After  ten  iterations  of  our  algorithm,  the  solution  is  very  close  to  the 
true  solution.  Comparing  these  intermediate  solutions  in  figure  5.4,  we  can  see  the 
solution  after  20  iterations  of  HCG  is  still  bumpy  and  the  computed  solution  after 
20  CG  iterations  is  far  from  the  true  solution.  However,  this  comparison  is  not 
very  fair  since  the  computational  cost  for  each  iteration  of  the  APCG,  HCG  and 
CG  algorithms  is  different.  Consequently,  we  include  the  execution  time  into  the 
comparison  as  follows.  On  a  multi-user  SUN  SPARC- 10  workstation,  10  iterations 
of  our  APCG  algorithm  on  this  experiment  took  2.14  seconds.  For  the  same  time, 
the  HCG  algorithm  ran  25  iterations  and  the  CG  algorithm  ran  30  iterations.  Figure 
5.5  shows  these  computed  solutions  for  the  three  different  algorithms  using  the  same 


106 


time.  We  can  see  that  our  APCG  algorithm  gives  the  best  result  for  this  comparison 
with  the  same  execution  time.  In  figure  5.6(a),  we  depict  the  convergence  rate  of 
the  APCG  algorithm  in  comparison  to  all  the  aforementioned  iterative  algorithms. 
Figure  5.6(b)  depicts  the  experiments  with  a  thin  plate  stabilizer.  The  convergence 
curves  for  our  APCG  algorithm,  HCG,  CG,  and  Yaou  &  Chang's  bi-orthogonal  basis 
transfer  with  Jaffard's  diagonal  preconditioning  are  shown.  As  can  be  seen,  our 
APCG  algorithm  has  the  best  convergence  performance  in  both  experiments. 

We  include  the  discontinuities  in  the  above  controlled-continuity  stabilizer  to 
test  the  performance  of  our  APCG  algorithm.  The  discontinuities  are  located  along  a 
pre-specified  diagonal  line.  Figure  5.7  illustrates  the  convergence  rates  of  our  APCG, 
HCG(  with  4  levels)  and  CG  algorithms.  The  APCG  algorithm  still  has  the  best 
convergence  performance  in  this  experiment.  Since  the  inclusion  of  discontinuities 
makes  the  conditioning  of  the  stiffness  matrix  worse  and  our  preconditioner  was  not 
redesigned  to  account  for  the  discontinuities  in  the  stabilizer,  the  convergence  rate  in 
this  experiment  is  slower  than  that  of  APCG  without  discontinuities.  The  computed 
solutions  after  10,  20,  40  and  60  iterations  of  our  APCG  algorithm  are  depicted  in 
figure  5.8. 

We  attempted  to  incorporate  the  preconditioner  developed  in  Pentland's  [61] 
into  the  comparison  however,  we  found  that  the  convergence  of  conjugate  gradient 
with  this  preconditioner  is  even  slower  than  that  of  the  CG  algorithm  (without  pre- 
conditioning) for  the  very  sparse  data  case.  Note  that  the  preconditioner  was  used 
in  a  gradient  decent  iteration  in  [61].    In  our  experiments,  the  preconditioners  are 
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Figure  5.4.  (a),  (d),  (g)  &  (j)  are  the  computed  solutions  using  the  APCG  algorithm, 
(b),  (e),  (h)  &  (k)  are  the  computed  solutions  using  HCG  algorithm,  (c),  (f),  (i)  & 
(1)  are  the  computed  solutions  using  conjugate  gradient  algorithm,  after  1,5,  10  and 
20  iterations,  respectively. 
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(a) 


(b) 


(c) 


Figure  5.5.  The  computed  solutions  after  (a)  10  iterations  of  APCG  (b)  25  iterations 
of  HCG  (c)  30  iterations  of  CG  using  the  same  time. 
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Figure  5.6.  The  convergence  for  different  preconditioners  in  the  preconditioned  con- 
jugate gradient  algorithm  in  the  surface  reconstruction  problem  with  the  synthetic 
sparse  data  set  and  (a)  with  controlled-continuity  stabilizer  (p  =  1,  r  =  0.1)  (b)  with 
thin-plate  stabilizer. 
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Figure  5.7.  The  convergence  for  different  preconditioners  in  the  preconditioned  con- 
jugate gradient  algorithm  for  the  surface  reconstruction  problem  with  discontinuities 
specified  in  the  controlled-continuity  stabilizer. 
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Figure  5.8.  The  computed  solutions  for  the  surface  reconstruction  with  discontinu- 
ities using  the  APCG  algorithm  after  (a)  10,  (b)  20,  (c)  40  and  (d)  60  iterations, 
respectively. 
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Figure  5.9.  The  convergence  of  Pentland's  preconditioner  in  conjunction  with  the  con- 
jugate gradient  algorithm  is  worse  than  that  of  conjugate  gradient  algorithm  without 
preconditioning  for  the  surface  reconstruction  problem  with  the  synthetic  sparse  data 
set  using  a  membrane  stabilizer. 

used  in  conjunction  with  the  conjugate  gradient  algorithm,  which  is  known  to  be 
much  more  efficient  than  a  gradient  decent  algorithm  [26].  We  use  the  sub-sampled 
data  set  in  figure  5.3(a)  with  a  membrane  stabilizer  in  a  32  x  32  problem  for  testing 
this  preconditioner.  The  convergence  curves  for  Pentland's  preconditioner  with  CG, 
CG  and  our  APCG  algorithms  are  shown  in  figure  5.9.  We  can  see  that  Pentland's 
preconditioning  is  worse  than  the  conjugate  gradient  iterations  without  precondition- 
ing. This  is  because  the  diagonal  dominance  behavior  of  Ks  after  wavelet  transform 
claimed  in  [61]  doesn't  hold  in  general.  Pentland's  preconditioner  was  constructed 
by  approximating  the  wavelet  transform  of  the  stiffness  matrix  by  its  diagonal  en- 
tries based  on  the  diagonal  dominance  assumption  of  Ks.  The  falsity  of  the  diagonal 
dominance  claim  is  evident  in  figure  5.10  where  we  profile  arbitrarily  selected  rows  of 
the  wavelet  transformed  membrane  matrix  Ks  of  size  (64  x  64)  for  the  ID  problem. 
In  [61],  the  preconditioner  was  chosen  as  the  diagonal  of  the  stiffness  matrix  K  in 
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Figure  5.10.  Row  profile  of  the  wavelet  transformed  membrane  matrix  Ks  of  size 
64  x  64  along  different  rows,  (a)  row  3  (b)  row  10  (c)  row  30  (d)  row  50.  This  figure 
establishes  the  falsity  of  the  diagonal  dominance  behavior  of  K,  claimed  in  [19]. 
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the  wavelet  basis.  Note  that  for  the  sparse  data  case,  the  matrix  Ka  dominates  Kj. 
Therefore,  bad  approximations  to  the  wavelet  transform  of  Ks  will  deteriorate  the 
preconditioning  effect  and  slow  down  the  convergence  speed. 


-I r— 


m  :  Biorthogonal  wavelet  transform  with  power  of  2  scaling 


Figure  5.11.  The  convergence  for  different  preconditioners  in  the  preconditioned  con- 
jugate gradient  algorithm  for  the  surface  reconstruction  problem  with  sparse  sym- 
metric data  set  using  a  controlled-continuity  stabilizer. 


The  third  experiment  uses  the  sparse  data  set  given  in  figure  5.3(b).  The  size 
of  the  problem  is  maintained  at  64  x  64.  The  controlled-continuity  stabilizer  with 
p  =  1  and  r  =  0.01  is  used  as  the  smoothness  constraint  here.  Note  that  the 
data  set  is  symmetric  with  respect  to  the  center.  Using  the  circular  convolution  in 
the  QMF  implementation  will  speed  up  the  convergence  rate  of  the  wavelet  based 
preconditioning  algorithm  in  this  case.  This  is  because  the  wavelet  preconditioner 
with  circular  convolution  operates  on  the  residue  vector  periodically  in  2D.  This 
is  very  helpful  for  symmetric  data  as  the  solution  is  supposed  to  be  symmetric  in 
this  case.  Yaou  &  Chang  [92]  used  the  circular  convolution  in  their  wavelet  basis 
transfer  and  most  of  their  experiments  are  performed  on  the  symmetric  data  sets 
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Figure  5.12.  The  computed  solutions  using  the  adaptive  preconditioned  conjugate 
gradient  algorithm  after  (a)  10  iterations,  (b)  40  iterations,  (c)  100  iterations,  in  the 
third  surface  reconstruction  experiment. 
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or  on  the  problems  with  the  solutions  which  are  somehow  periodically  continuous. 
We  have  implemented  both  the  circular  convolution  and  the  linear  convolution  (with 
self-reflection  at  the  border  [50])  in  the  QMF  for  our  adaptive  preconditioner  in 
this  experiment.  The  convergence  curves  for  a  comparison  of  the  aforementioned 
algorithms  are  plotted  in  figure  5.11.  Our  preconditioner  with  the  linear  as  well  as 
periodic  convolutions  has  the  best  convergence  performance.  The  computed  solutions 
after  10,  40  and  100  iterations  are  shown  in  figure  5.12. 

The  other  data  set  in  our  surface  reconstruction  experiments  is  the  real  laser 
range  data  from  a  ball.  Figure  5.3(c)  shows  the  148  randomly  sampled  data  points 
from  the  original  range  data  set  generated  by  the  MSU  (Michigan  State  University) 
Pattern  Recognition  and  Image  Processing  Lab's  Technical  Arts  100X  scanner.  The 
size  of  the  problem  domain  is  152  x  193.  The  controlled-continuity  stabilizer  with 
p  —  1  and  t  =  0.1  was  used  in  this  experiment.  The  discontinuities  are  specified 
along  the  boundary  of  the  object.  The  discontinuities  divide  the  problem  domain 
into  two  non-overlapped  regions,  where  the  inside  region  corresponds  to  the  object 
surface  and  the  outside  region  corresponds  to  uniform  background.  Our  adaptive 
preconditioning  is  used  in  the  inside  region  only  by  enforcing  the  solution  values 
in  the  outside  region  to  be  zero  after  the  preconditioning.  The  convergence  curves 
for  our  APCG,  HCG  (with  different  levels),  and  CG  algorithms  are  shown  in  figure 
5.13(a).  Our  adaptive  preconditioned  CG  algorithm  still  has  the  best  convergence 
performance  in  this  experiment  with  real  laser  range  data.  Very  accurate  solution  is 
obtained  after  10  iterations  of  the  APCG  algorithm,  as  shown  in  figure  5.13(b). 
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Figure  5.13.  (a)  The  convergence  for  the  APCG,  HCG  and  CG  algorithms  for  the 
surface  reconstruction  problem  with  the  real  laser  range  data  set  using  a  controlled- 
continuity  stabilizer,  (b)  The  computed  solution  after  10  APCG  iterations. 
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5.4.2     Shape  from  Shading 

For  the  shape-from-shading  experiment,  we  use  the  synthetic  bump  images  with 
size  64  x  64  used  in  [76,  38],  as  shown  in  figure  5.14.  These  two  images  are  generated 
from  the  Lambertian  reflectance  model  with  the  light  source  directions  (ps,  qs,  1)  being 
(0.34,0.3467,1)  and  (-0.34,0.3467,1)  for  the  left  and  right  images,  respectively.  In 
this  experiment,  only  the  left  image  is  taken  as  the  input  to  the  SFS  algorithm 
to  recover  the  shape,  and  the  right  image  is  used  to  monitor  the  closeness  of  the 
computed  solution  to  the  true  solution. 


(a) 


0>) 


Figure  5.14.  Shaded  images  of  bump  with  different  light  source  directions. 


The  construction  of  our  preconditioner  for  the  shape-from-shading  problem  is 
described  in  the  previous  section.  The  parameters  A  and  y,  are  set  to  be  0.5  in 
this  experiment.  Without  including  any  boundary  condition  in  the  energy  function 
to  be  minimized,  we  compare  the  convergence  of  our  APCG  algorithm  with  the 
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Figure  5.15.  The  convergence  of  the  APCG,  HCG,  and  CG  algorithms  for  the  shape- 
from-shading  problem. 


(a) 


(b) 


Figure  5.16.  Shaded  images  of  bump  after  100  iterations  of  APCG  algorithm  without 
using  any  boundary  condition. 
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HCG  and  CG  algorithms.  The  convergence  curves  for  the  energy  function  and  the 
computed  (p,  q)  error  are  shown  in  figure  5.15.  Note  that  the  (p,  q)  error  is  defined  as 
£  |p»- P*l  +  k»  — 9*li  where  (p,-,g.)  and  {p*,q*)  are  the  computed  and  true  (p,q)  at  the 
i-th  location,  respectively.  We  can  see  the  APCG  algorithm  has  the  best  convergence 
behavior  in  figure  5.15(a)  and  (b).  The  computed  solution  after  100  iterations  of  the 
APCG  algorithm  is  fed  back  to  the  reflectance  model  to  generate  the  left  and  right 
bump  images  as  shown  in  figure  5.16.  Comparing  these  computed  images  to  those  in 
figure  5.14,  we  can  see  that  this  computed  solution  is  far  from  the  true  solution  in 
the  upper  left  region,  although  its  corresponding  energy  is  close  to  0.  This  is  because 
the  energy  function  for  the  SFS  problem  is  nonconvex  and  gradient-based  methods 
(e.g.  CG-type  methods)  tend  to  get  trapped  in  local  minima.  To  obtain  the  true 
solution  using  the  gradient-based  method,  we  need  to  have  an  appropriate  initial 
guess  that  is  closer  to  the  true  solution.  But  it  is  difficult  to  find  such  initial  guess 
that  is  guaranteed  to  converge  to  the  true  solution  in  the  CG-type  algorithms.  In 
fact,  we  always  use  0  as  the  initial  guess  in  our  experiments. 

To  resolve  the  above  problem  for  minimizing  the  SFS  energy  function  by  gradient- 
based  methods,  we  include  the  boundary  conditions  obtained  from  the  occluding 
boundary  [41]  as  an  additional  data  constraint  in  the  energy  function  to  be  mini- 
mized. By  incorporating  the  boundary  conditions  into  the  SFS  energy  function,  the 
CG-type  algorithms  can  approach  the  true  solution.  In  our  experiment,  we  use  the 
active  contour  model  [43]  to  locate  the  occluding  boundary  at  first.  By  using  the 
property  that  the  surface  normal  at  a  point  on  the  occluding  boundary  is  parallel 
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Figure  5.17.  Shaded  images  of  bump  after  (a)  &  (b)  10  iterations,  (c)  &  (d)  20 
iterations,  (e)  &  (f )  40  iterations  of  the  APCG  algorithm  with  the  occluding  boundary 
condition. 
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to  the  normal  of  the  silhouette  in  the  image  plane  [41],  we  impose  the  Direchelet 
boundary  condition  on  (p,q)  along  the  occluding  boundary.  In  addition,  we  use 
the  occluding  boundary  to  incorporate  the  discontinuities  in  (p,  q)  or  z  representing 
orientation  discontinuities  or  depth  discontinuities  respectively.  In  this  experiment, 
we  incorporate  the  orientation  discontinuities  along  the  occluding  contour.  Figure 
5.17  shows  the  left  (north  west)  and  right  (north  east)  images  generated  from  the 
computed  solutions  for  our  APCG  algorithm  after  10,  20  and  40  iterations.  We  can 
see  that  the  computed  solutions  approach  to  the  true  solution  after  incorporating 
the  boundary  conditions.  The  computed  gradient  and  height  after  40  iterations  are 
shown  in  figure  5.18. 
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Figure  5.18.    The  computed  (a)  gradient  and  (b)  height  after  40  iterations  of  the 
APCG  algorithm. 


The  convergence  of  our  APCG  algorithm  was  compared  to  the  HCG  and  CG 
algorithms  for  the  SFS  problem  with  the  Direchelet  boundary  condition.  The  con- 
vergence curves  for  the  energy,  p  —  q  error  and  z  error  reduction  are  shown  in  figure 
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Figure  5.19.   The  convergence  curves  of  the  (a)  energy,  (b)  p  —  q  error,  (c)  z  error 
for  the  CG,  HCG  and  APCG  algorithm. 
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5.19.  Our  APCG  outperforms  the  other  methods  in  terms  of  convergence  speed  for 
all  of  the  three  different  measures. 

5.4.3     Optical  Flow  Computation 

In  section  2.3,  we  presented  a  modified  gradient-based  formulation  of  the  opti- 
cal flow  computation  problem.  Our  regularization-based  formulation  combines  the 
region-based  [40]  and  contour-based  [35]  gradient  constraints.  The  gradient  con- 
straint for  the  brightness  function  E  is  active  all  over  the  image  plane  except  in  the 
the  neighborhood  of  the  discontinuity  locations,  while  the  gradient  constraint  for  the 
brightness  function  after  the  Laplacian  of  Gaussian  operation,  i.e.  V2G*  E,  is  active 
at  the  discontinuity  locations.  In  our  implementation,  we  locate  the  zero-crossings  of 
V2G*E  and  label  them  as  discontinuities  [35].  Accurate  estimation  of  Ex,  Ey  and  Et 
is  necessary  to  obtain  a  reliable  flow  constraint.  When  E  is  the  brightness  function, 
we  use  the  Gaussian  spatial-temporal  pre-smoothing  prior  to  applying  the  two-point 
central  difference  to  approximate  the  partial  derivatives.  When  E  is  V2G  *  E,  a 
two-point  central  difference  is  applied  to  E  to  approximate  its  partial  derivatives.  As 
discussed  in  section  2.3,  a  normalization  of  the  gradient  constraints  is  used  to  make 
the  contribution  of  each  (active)  flow  constraint  uniform. 

We  present  two  experiments  for  the  optical  flow  computation.  The  first  experi- 
ment consists  of  a  synthetic  translating  square  image  sequence;  and  the  second  is  a 
real  image  sequence  consisting  of  a  rotating  Rubic  cube.  One  frame  for  each  image 
sequence  is  shown  in  figure  5.20  .  Each  frame  for  the  translating  square  and  rotating 
Rubic  cube  image  sequence  is  of  size  100  x  100  and  256  x  240  respectively. 
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(a) 


(b) 


Figure  5.20.  One  frame  from  the  (a)  square  image  sequence  and  (b)  rotating  Rubic 
cube  image  sequence. 

The  construction  of  our  adaptive  preconditioner  for  the  optical  flow  problem  is 
described  in  section  5.3.  This  preconditioner  is  embedded  in  the  conjugate  gradient 
to  accelerate  the  convergence  speed  for  solving  the  linear  system  in  the  optical  flow 
computation.  We  use  the  CG,  HCG  (  with  various  number  of  levels),  and  our  APCG 
algorithms  to  solve  the  problem.  Figure  5.22(a)  and  (b)  shows  the  convergence  rates 
of  the  three  algorithms  for  the  translating  square  and  the  rotating  Rubic  examples, 
respectively.  We  can  see  that  our  APCG  algorithm  has  the  best  performance  in 
convergence.  Note  that  the  HCG  algorithm  also  converges  quite  fast  because  the 
linear  system  for  the  optical  flow  problem  is  not  as  ill-conditioned  as  the  one  in  the 
surface  reconstruction  and  shape-from-shading  problems.  Therefore,  the  difference 
between  various  preconditioning  methods  in  not  as  pronounced  for  the  optical  flow 
problem  as  in  the  other  two  (SR  k  SFS)  problems.   Although  the  HCG  algorithm 
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with  an  appropriate  number  of  levels  L,  exhibits  fast  convergence,  it  is  difficult  to 
determine  the  best  L  for  a  problem  in  advance.  As  for  the  APCG  algorithm,  the 
number  of  levels  is  always  chosen  to  be  the  maximum  possible  which  allows  the  linear 
convolution  in  the  QMF  implementation  with  the  mirror  reflection  at  the  border  to 
be  carried  out  with  ease. 
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Figure  5.21.    Computed  flow  of  the  (a)  translating  square  and  (b)  rotating  Rubic 
cube. 


In  our  examples,  the  regularization  parameter  A  was  set  to  a  value  of  10  for  the 
translating  square  example  and  1  for  the  rotating  Rubic  cube  experiment.  For  the 
former,  the  computed  optical  flows  after  10  iterations  of  the  APCG  algorithm  is  shown 
in  figure  5.21(a),  and  the  average  angular  error  [3]  is  0.21°  with  the  standard  deviation 
0.07°.  Compared  to  the  results  for  the  other  gradient  based  methods  reported  in 
Barron  et  al.  [3],  our  new  formulation  gives  the  best  solution  with  100%  flow  density. 
In  fact,  our  adaptive  preconditioner  can  also  be  used  with  the  other  gradient  based 
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Figure  5.22.  The  convergence  curves  of  the  CG,  HCG  and  APCG  algorithms  for  (a) 
translating  square  (b)  rotating  Rubic  image  sequence. 

regularization  formulations  of  the  optical  flow  problem.  For  the  rotating  Rubic  cube 
experiment,  the  computed  optical  flow  after  15  iterations  of  our  APCG  algorithm  is 
shown  in  figure  5.21(b). 

5.5     Summary  and  Discussion 

In  this  chapter,  we  presented  a  novel  physically-based  adaptive  precondition- 
ing technique  which  was  used  in  conjunction  with  a  conjugate  gradient  algorithm  to 
drastically  improve  the  speed  of  convergence  for  solving  various  early  vision  prob- 
lems. A  preconditioner  based  on  the  membrane  spline  or  the  thin  plate  spline  or  a 
convex  combination  of  the  two  was  termed  as  a  physically-based  preconditioner  for 
obvious  reasons.  The  adaptation  of  the  preconditioner  to  an  early  vision  problem 
was  achieved  via  the  explicit  use  of  the  spectral  characteristics  of  the  regularization 
filter  in  conjunction  with  the  data.  This  spectral  function  was  used  to  modulate  the 
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frequency  characteristics  of  a  chosen  wavelet  basis  leading  to  the  construction  of  our 
preconditioner.  The  preconditioning  technique  was  demonstrated  for  the  surface  re- 
construction, shape  from  shading  and  optical  flow  computation  problems.  Through 
experiments,  we  demonstrated  the  superiority  of  our  preconditioning  method  over 
existing  methods. 

Our  adaptive  preconditioning  technique  can  be  used  in  conjunction  with  any 
wavelet  basis.  The  choice  of  the  wavelet  basis  is  crucial  to  achieve  efficient  pre- 
conditioning. The  ideal  wavelet  needs  to  be  well  localized  in  spatial  (or  time)  and 
frequency  domain.  Good  spatial  localization  makes  the  QMF  implementation  very 
efficient,  i.e.,  the  wavelet  transform  can  be  accomplished  very  efficiently.  Good  local- 
ization in  frequency  domain  of  the  wavelet  is  necessary  to  achieve  fast  convergence 
since  the  the  accuracy  of  spectral  approximation  to  the  regularization  filter  depends 
on  the  localization  in  frequency  domain  of  the  wavelet.  But  the  localization  in  spa- 
tial and  frequency  domain  cannot  be  arbitrarily  small,  because  the  product  of  their 
bandwidths  is  lower  bounded  by  a  constant  ^-,  which  is  known  as  the  uncertainty 
principle  or  Heisenberg  inequality.  This  implies  that  there  is  a  tradeoff  between  the 
spatial  and  frequency  localization.  Gaussian  functions  are  optimal  in  the  sense  that 
they  meet  the  bound  with  equality  [20].  If  we  use  the  wavelet  with  the  optimal 
property,  the  preconditioning  should  be  more  efficient. 


CHAPTER  6 
ROBUST  OPTICAL  FLOW  COMPUTATION 


6.1     Introduction 

Optical  flow  computation  is  a  fundamental  problem  in  the  motion  analysis  of 
image  sequences.  It  provides  very  important  information  for  estimating  3-D  velocity 
fields,  analyzing  rigid  and  nonrigid  motion,  segmenting  the  image  into  regions  based 
on  their  motion,  or  recovering  3-D  structure  of  objects  in  the  image.  Many  techniques 
for  computing  optical  flow  have  been  proposed  in  literature.  These  can  be  classified 
into  gradient-based,  correlation-based,  energy-based,  and  phase-based  methods  [3]. 
In  this  chapter,  we  present  two  robust  and  efficient  algorithms  namely,  a  modified 
gradient-based  algorithm  and  an  SSD-based  regularization  algorithm,  for  optical  flow 
computation. 

The  gradient-based  approach  is  dependent  on  the  image  flow  constraint  equa- 
tion, which  is  derived  from  the  brightness  constancy  assumption  as  well  as  the  first- 
order  Taylor  series  approximation  [40].  Using  the  image  flow  constraint  equation 
alone  is  not  enough  to  compute  the  optical  flow  since  each  equation  involves  two 
different  variables.  Horn  and  Schunck  [40]  introduced  the  membrane  smoothness 
measure  to  constraint  the  flow  field.  Since  the  smoothness  constraint  is  invalid  across 
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the  motion  boundary,  Nagel  [56]  proposed  the  "oriented  smoothness"  measure  to  sup- 
press the  smoothness  constraint  in  the  direction  perpendicular  to  boundaries.  How- 
ever, the  smoothness  across  motion  boundaries  problem  can  be  resolved  by  using  the 
regularization  with  discontinuities  [80]. 

A  major  problem  with  the  gradient-based  approach  is  the  unreliability  of  the 
image  flow  constraint  equation  in  the  areas  whose  local  brightness  function  is  highly 
nonlinear.  These  areas  occur  in  scenes  containing  highly  textured  regions,  motion 
discontinuities,  or  depth  discontinuities.  The  first-order  Taylor  series  approximation 
used  in  the  derivation  of  the  image  flow  constraint  equation  leads  to  inaccuracies 
when  the  higher  order  terms  are  significant.  The  higher  order  terms  are  neglected 
in  the  derivation  of  the  image  flow  constraint  equation  by  assuming  the  time  step 
between  consecutive  frames  is  arbitrarily  small,  i.e.  approaches  0,  which  is  far  from 
being  practical.  Usually,  the  time  step  between  consecutive  images  is  fixed  and  is  not 
arbitrarily  small.  Therefore,  the  gradient  constraint  equation  is  reliable  only  when 
the  higher  order  derivatives  of  the  brightness  function  are  insignificant,  i.e.,  the  local 
brightness  function  is  well  approximated  by  a  linear  function.  Most  of  the  gradient- 
based  methods  do  not  account  for  the  reliability  of  the  gradient  constraint  equation. 
In  [32],  Heitz  and  Bouthemy  used  the  hypotheses  testing  technique  to  determine 
whether  the  image  flow  constraint  equation  is  valid  at  each  site  in  the  image  plane. 

The  above  mentioned  gradient  constraint  is  very  unreliable  in  the  neighborhood 
of  discontinuities  of  the  brightness  function.  However,  the  contour-based  methods 
can  give  robust  estimates  of  the  optical  flow  along  the  zero-crossing  contours,  which 
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are  the  discontinuity  locations  of  the  image  brightness  function.  The  contour-based 
methods  use  a  different  flow  constraint  along  the  zero-crossing  contours.  This  flow 
constraint  along  contours  is  obtained  by  replacing  the  image  brightness  function  I  by 
the  convolution  of  I  with  the  second  derivative  of  Gaussian  smoothing  function.  We 
propose  to  use  the  image  flow  constraint  in  the  areas  away  from  the  brightness  dis- 
continuities and  use  the  contour-based  flow  constraint  at  the  discontinuity  locations. 
Thus,  in  our  gradient-based  regularization  method,  the  unreliable  image  flow  con- 
straints near  discontinuity  locations  are  replaced  by  the  robust  contour-based  flow 
constraints  at  the  zero-crossing  locations.  This  yields  an  accurate  estimate  of  the 
optical  flow  field  vectors. 

The  correlation-based  approach  locally  finds  the  displacement  vector  (u,  v)  be- 
tween two  images  70  and  I\  at  the  location  (x,  y)  by  minimizing  the  sum  of  squared 
difference  (SSD)  function 


k  k 

SSD(x;u,v)=  Y,    J2w{i,j)[I1(x  +  i  +  u,y+j  +  v)-I0(x  +  i,y  +  j)]2     (6.1) 

j=-k  i=-k 


where  w(i,j)  is  the  weighting  function.  In  this  SSD  function,  the  summation  is 
performed  in  a  window  of  size  (2k  +  1)  x  (2k  +  1)  centered  at  (x,  y).  Most  correlation- 
based  methods  do  an  extensive  search  of  the  displacement  vector  (u,  v)  in  a  finite 
integer-pair  set  and  find  the  pair  with  the  smallest  SSD  value  to  be  the  displacement 
at  that  location.  Since  the  search  region  of  the  displacement  vector  is  discretized 
for  an  extensive  search,  the  accuracy  of  the  computed  optical  flow  is  limited  by  this 
discretization.  In  addition,  the  correlation-based  approach  is  incapable  of  producing 
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reliable  optical  flow  estimates  in  a  homogeneous  region.  Anandan  [2]  treated  the 
estimates  provided  by  the  matching  process  as  the  data  constraints  for  optical  flow 
with  appropriate  confidence  measure,  and  incorporated  the  smoothness  constraint  on 
optical  flow  to  achieve  better  results.  Recently,  Szeliski  and  Coughlan  [77]  used  the 
two-dimensional  spline  model  to  represent  the  flow  field  and  minimized  the  following 
SSD  function 

£(u)  =  £  XX*  +  u«i.  J  +  w«)  "  'o(i,  j)]2  (6.2) 

where  the  vector  u  is  the  concatenation  of  the  flow  components  u,j  and  u,j.  The 
Levenberg-Marquardt  algorithm  was  then  employed  to  solve  this  nonconvex  opti- 
mization problem.  They  reported  very  accurate  results  using  this  method.  The  2-D 
spline  models  for  optical  flow  field  assume  the  flow  field  to  be  well-approximated 
by  the  2-D  spline  basis  functions  in  the  preset  patches.  However,  it  is  difficult  to 
incorporate  discontinuities  into  these  2-D  spline  patches.  In  our  method,  we  use  the 
standard  finite  difference  discretization  on  the  flow  field  and  take  the  SSD  as  the 
data  constraint  energy  in  a  regularization  framework.  A  preconditioned  nonlinear 
conjugate  gradient  algorithm  is  employed  to  minimize  the  total  energy  function.  The 
experimental  results  on  the  standard  synthetic  image  sequence  using  our  algorithm 
are  better  than  or  comparable  to  the  best  existing  results  reported  in  literature. 
Unlike  some  other  algorithms  that  only  generate  sparse  optical  flow  estimates,  our 
algorithms  give  dense  optical  flow  estimates.  In  addition,  motion  discontinuities  can 
be  incorporated  in  our  (regularization)  framework. 
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Recently,  Barron  et  al.,  [3]  have  presented  a  comprehensive  survey  of  various 
optical  flow  computation  techniques  along  with  a  quantitative  comparison  of  them. 
From  their  experimental  results,  they  argue  that  the  phase-based  method  by  Fleet 
and  Jepson  [18]  gives  the  most  accurate  result.  However,  most  of  the  reported  results 
are  only  for  sparse  optical  flow  measurements  obtained  after  thresholding  unreliable 
components,  present  in  the  neighborhood  of  phase  singularities. 

The  remainder  of  this  chapter  is  organized  as  follows.  In  the  next  section,  we 
present  our  modified  gradient-based  method  that  combines  the  region-based  gradient 
constraint  in  the  regions  away  from  brightness  discontinuities  and  the  contour-based 
gradient  constraint  at  these  discontinuity  locations.  In  section  6.3,  our  SSD-based 
regularization  technique  for  optical  flow  computation  from  two  images  of  a  time 
sequence  is  proposed.  The  experimental  results  for  both  algorithms  on  synthetic  and 
real  image  sequences  are  presented  in  section  6.4.  Finally,  we  conclude  this  chapter 
in  section  6.5. 

6.2     Modified  gradient-based  method 

At  the  foundation  of  the  standard  gradient- based  approach  is  the  image  flow 
constraint  equation, 

Ix(x,  y,  t)u(x,  y,  t)  +  Iy(x,  y,  t)v(x,  y,  t)  +  It(x,  y,  t)  =  0  (6.3) 

where  I(x,y,t)  is  the  image  brightness  function  at  location  (x,y)  and  time  t,  Ix,Iy 
and  7(  are  the  partial  derivatives  of  /  with  respect  to  x,  y  and  t,  respectively,  and 
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(u,  v)  is  the  optical  flow  field.  The  image  flow  constraint  equation  has  been  used  in 
different  ways  to  obtain  the  optical  flow  estimates  as  reported  in  the  literature  on 
gradient-based  methods  [40,  49,  55,  84].  In  a  regularization  framework,  the  image 
flow  constraint  equation  is  regarded  as  the  data  constraint  and  additional  smoothness 
constraint  is  imposed  on  the  optical  flow. 

The  image  constraint  equation  is  derived  by  taking  the  Taylor  expansion  of  the 
image  constancy  equation  dI{xJ'l)  =  0  up  to  the  first-order  terms.  The  higher-order 
terms  are  neglected  under  the  assumption  that  the  time  step  between  consecutive 
frames  is  arbitrarily  small,  which  is  most  often  violated  in  practice.  In  reality,  the 
time  step  is  usually  fixed.  Therefore,  the  lack  of  higher-order  terms  becomes  the  main 
source  of  errors  in  the  data  constraint.  The  reliability  of  the  image  flow  constraint 
equation  depends  on  the  magnitudes  of  the  higher  order  derivatives  of  image  bright- 
ness function.  If  the  image  brightness  function  in  the  neighborhood  of  one  point  is 
well  approximated  by  a  linear  function,  i.e.  its  higher-order  derivatives  are  small, 
then  the  image  flow  constraint  is  very  reliable  at  this  point.  On  the  contrary,  the 
image  flow  constraint  is  very  unreliable  at  the  locations  with  significant  higher-order 
derivatives,  which  are  mainly  the  neighborhood  of  brightness  discontinuities.  Since 
the  use  of  an  unreliable  flow  constraint  will  degrade  the  accuracy  of  the  optical  flow 
estimation,  we  propose  to  disable  this  unreliable  image  flow  constraint  in  the  neigh- 
borhood of  discontinuities  and  enable  the  robust  contour-based  flow  constraint  at  the 
discontinuity  locations  in  a  regularization  framework. 
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The  contour-based  methods  [35,  17]  compute  the  optical  flow  along  the  zero- 
crossing  contours  by  using  the  following  flow  constraint  equation 

Sx(x,  y,  t)u(x,  y,  t)  +  Sy(x,  y,  t)v(x,  y,  t)  +  St(x,  y,t)  =  0  (6.4) 

where  S(x,  y,  t)  is  the  convolution  of  I(x,  y,  t)  with  the  second  derivative  of  the  Gaus- 
sian function,  Sx,  Sy,  and  St  are  the  partial  derivatives  of  S  with  respect  to  x,  y  and 
t,  respectively.  In  [35],  the  function  S  was  chosen  to  be  the  convolution  of  the  Lapla- 
cian  of  Gaussian  (LOG)  filter  with  the  image  brightness  function,  i.e.  V2G*  J,  where 
V2  =  #r  +  ^r,  G(x,y)  is  the  2-D  spatial  Gaussian  function  and  the  convolution  is 
performed  in  2D. 

Now,  we  incorporate  the  above  two  flow  constraint  equations  into  the  regu- 
larization  framework.  The  image  flow  constraint  in  equation  6.3  is  disabled  in  the 
neighborhood  of  brightness  discontinuities,  while  the  contour-based  flow  constraint 
is  active  along  the  zero-crossing.  The  variational  formulation  of  the  optical  flow 
problem  using  this  selective  scheme  for  the  data  constraint  leads  to  minimizing  the 
following  functional 

/  a((V/.u)  +  /t)2  +  A(||  Vu  H2.  +  ||  Vt>  \\22)dx+  i  p(VS»u+St)2dx  (6.5) 

JQ  J  zero— crossing 

where  u(x,  t)  =  (u(x,  t),v(x,  t))  is  the  optical  flow  field  to  be  estimated,  x  =  (x,y) 
is  a  2-D  vector,  fl  is  the  2-D  image  domain,  a,  A  and  0  are  the  weighting  functions 
associated  to  the  image  flow  constraint,  smoothness  constraint  and  contour-based 
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flow  constraint,  respectively.  In  this  dissertation,  the  weighting  functions  A  and  /? 
are  chosen  to  be  constants,  and  a(x)  has  value  0  when  x  is  in  the  neighborhood  of 
discontinuities  and  takes  a  constant  value  elsewhere.  A  discrete  version  of  the  above 
energy  can  be  written  as 

£(£,,,-«,•  +  Erf*  +  Et,t)2  +  A«t-  +  <t-  +  <t-  +  v2y-)  (6.6) 

i 

Where,  the  index  i  denotes  the  i-th  location,  the  indices  x,  y  and  t  denote  the  partial 
derivatives  along  the  corresponding  directions,  u  and  v  are  the  discretized  flow  vector, 
and  the  function  E  is  defined  as  follows 


ali     when      i  G  D\N 
Ei  =  { 

(3Si    when      i  €  Z 


The  set  D  contains  all  the  discretized  locations  in  the  image  domain,  AT  is  the  set  of 
locations  in  the  neighborhood  of  discontinuities,  and  the  set  Z  contains  the  discretized 
locations  along  the  zero-crossing  contours. 

The  image  flow  data  constraint  and  the  contour-based  flow  constraint  are  ob- 
tained by  using  the  first-order  Taylor  series  approximation  of  the  functions  /  and  S, 
respectively,  thus  each  constraint  is  unreliable  in  the  regions  where  the  underlying 
function  (  /  or  S  )  is  not  well  approximated  by  a  linear  function  locally,  which  means 
the  function  contains  significant  nonlinear  terms.  We  define  a  function  6(i)  to  be  a 
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measure  of  reliability  for  the  data  constraint  at  i-th  location  as  follows 


,,.,      \EMti\  +  [jWj  +  \Ett,i\  +  \Exy,t\  +  \Extj\  +  \Eytii\ 

S{1)  =  1^,1  +  1^,1  +  1^,1  •  (6J) 


The  partial  derivatives  of  the  function  E  in  equation  6.7  can  be  estimated  by  standard 
finite  difference  methods.  The  reliability  measure  6(i)  is  the  ratio  between  the  sums 
of  the  absolute  values  of  the  second-order  and  first-order  partial  derivatives  of  the 
function  E  at  the  i-th  location.  Therefore,  the  corresponding  data  constraint  is 
reliable  when  the  measure  6  is  small  and  unreliable  when  8  is  large.  We  can  set  a 
threshold  to  reject  the  data  constraints  with  a  high  6  function  value.  This  threshold 
was  set  to  1.0  in  our  implementation. 

The  data  constraint  in  the  above  energy  function  implies  the  weighting  /£,-  + 
Iyi  or  S^  +  5?fj  on  («,-,Uj)  at  site  i.  This  weighting  is  not  reasonable  since  the 
flow  constraints  at  high  gradient  locations  are  not  necessarily  more  reliable  than 
those  at  low  gradient  locations.  In  fact,  they  tend  to  be  less  reliable  because  the 
brightness  gradients  close  to  discontinuities  are  normally  higher  than  those  away  from 
discontinuities.  Therefore,  we  eliminate  this  weighting  on  each  data  constraint  via 
normalization.  To  have  a  uniform  contribution  at  each  pixel  from  the  flow  constraint, 
we  divide  the  flow  constraint  at  the  i-th  location  by  the  normalization  factor  (1%  ,•  + 
^u,)5  or  (^i  +  ^i)J>  Let's  denote  the  normalized  flow  constraint  by  EXtiUi  +  EVtiVi  + 
Etti  =  0,  then  the  energy  function  to  be  minimized  becomes 

/(u)  =  *E(E*,iUi  +  Ey<iVi  +  Etii)2  +  X(u2Xti  +  u2Vfi  +  vlti  + 1£.)  (6.8) 
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where  the  vector  u  is  the  concatenation  of  all  the  components  u,  and  v,\ 

The  minimization  of  the  above  energy  leads  to  solving  a  linear  system  of  equa- 
tions Ku  =  b  where  the  stiffness  matrix  K  €  sft2™2*2"2  is  symmetric  positive-definite 
and  it  has  the  following  2x2  block  structure. 


K 


K.  +  Hjxx        E 


xy 


Eixy        Ka  +  E 


yy 


(6.9) 


where  Ks  G  dtn2*n2  is  the  discrete  2-D  Laplacian  matrix  from  the  membrane  smooth- 

— 2 

ness  constraint,  Exx,  Exy,  and  Eyy  are  all  n2  x  n2  diagonal  matrices  with  entries  Exi, 
Ex,iEy,i  and  Eyi,  respectively. 

To  efficiently  solve  the  above  linear  system  for  computing  optical  flow,  we  use 
the  adaptive  preconditioned  conjugate  gradient  algorithm  developed  in  chapter  5. 

6.3     SSD-based  Regularization  Method 

The  SSD  function  in  equation  6.2  can  be  used  as  the  data  constraint  energy  in 
a  regularization  framework.  Incorporating  the  membrane  smoothness  and  the  SSD 
constraints  into  the  regularization  formulation,  we  obtain  the  total  energy  function 


where  (x,-,y,-)  is  the  i-th  discretized  location  in  the  image  domain.  Note  that  the  data 
constraint,  h(xi  +  «,-,y,-  +  t>,-)  =  Io{xi,  Vi),  is  not  linear  in  the  flow  vector  Ui  or  u,-. 
If  the  Taylor  expansion  of  the  data  constraint  is  taken  up  to  first  order  terms,  we 
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obtain  the  image  flow  constraint  in  equation  6.3.  Consequently,  the  gradient-based 
regularization  approach  can  be  perceived  as  an  approximation  to  the  SSD-based 
regularization  formulation.  The  energy  function  to  be  minimized  in  the  gradient- 
based  regularization  is  quadratic  and  convex,  while  the  energy  function  in  equation 
6.10  is  neither  quadratic  nor  convex.  For  the  quadratic  and  convex  energy  function  in 
the  gradient-based  regularization,  the  energy  minimization  can  be  accomplished  by 
solving  an  SPD  linear  system.  The  minimization  of  the  nonquadratic  and  nonconvex 
energy  function  is  more  difficult  and  requires  more  computational  effort.  The  SSD- 
based  regularization  can  give  more  accurate  estimation  than  that  from  the  gradient- 
based  regularization,  since  the  data  constraint  used  in  the  SSD-based  regularization 
is  more  precise  than  that  in  the  gradient-based  regularization.  However,  it  definitely 
requires  more  computational  work  to  minimize  the  energy  function. 

The  total  energy  function  in  equation  6.10  is  a  nonlinear  function  with  a  large 
number  of  variables.  We  use  the  nonlinear  conjugate  gradient  method  [23]  with 
a  modified  search  direction  to  minimize  this  nonlinear  energy  function  in  equation 
6.10.  The  nonlinear  conjugate  gradient  algorithm  is  very  efficient  in  the  unconstrained 
optimization  of  large-scale  nonlinear  functions  [23].  Although  this  algorithm  can  only 
find  the  local  minima,  we  can  obtain  very  accurate  solution  when  the  initial  solution 
is  close  to  the  true  solution.  For  example,  small  displacement  between  consecutive 
frames  can  be  easily  found  when  we  start  the  nonlinear  CG  from  zero  initial  values. 
In  fact,  a  very  good  initial  solution  can  be  obtained  from  the  optical  flow  estimated  in 
the  previous  frames.  In  addition,  the  scale  space  tracking  scheme  [89]  can  be  employed 
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to  take  care  of  large-displacement  problems  and  to  accelerate  the  convergence  rate  of 

the  nonlinear  CG  algorithm. 

6.3.1     Nonlinear  Conjugate  Gradient  Algorithm 

The  nonlinear  conjugate  gradient  method  for  minimizing  general  nonlinear  func- 
tions is  outlined  as  follows  [23]: 

1.  d0  =  r0  =  -/'(u0). 

2.  Find  a,  that  minimizes  /(u,  +  a,-d,-). 

3.  u,+1  =  u,  +  Q,d,. 

4.  r,+i  =  -/'(u,+1). 

5.  &+,  =  &|£l  or  #+1  =  max{^±pl,0}. 

6.  d,+1  =  ri+i  -I-  /3,+id,. 

7.  If  rjt  c±  0,  stop;  else  set  i  :=  i  +  1  and  go  to  step  2. 

Step  2  requires  a  line  search  method  to  minimize  the  function  /(u,  +  aid,)  at  u,  along 
the  direction  d,.  The  Newton- Raphson  method  and  the  Secant  method  are  two  popu- 
lar line  search  schemes  to  approximate  the  best  a,.  We  employ  the  Newton-Raphson 
method  in  the  nonlinear  CG.  The  Taylor  series  approximation  of  the  function  /  up 
to  the  second-order  terms  is  used  to  derive  the  approximate  a,,  giving 


/'(u,)d, 
°'  -  -df/»(u.)d,-  <6U) 
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In  step  5,  there  are  two  ways  for  choosing  f}.  The  first  is  Fletcher- Reeves  formula  and 
the  other  is  Polak-Ribiere  formula.  We  adopt  the  latter  since  it  usually  converges 
much  more  quickly,  although  the  nonlinear  CG  needs  to  restart  when  0  =  0.  Note 
that  /'(u)  and  /"(u)  in  this  algorithm  denote  the  gradient  vector  and  the  Hessian 
matrix  of  the  function  /.  From  equation  6.10,  we  can  obtain 


/'(u)  = 


• 

f 

J   d,u 

+  A 

f'd,v 

K,     0 


0     K, 


u 


/»  = 


(6.12) 


(6.13) 


where  the  matrix  K,  is  the  Laplacian  matrix  obtained  from  the  membrane  smoothness 
constraint,  the  vectors  f'd u  and  f'dv  are  the  concatenations  of  the  components  f'dyUyi 
and  f'dvi  respectively,  Kd)UU,  Kd,Uv  and  Kd,Vv  are  all  diagonal  matrices  with  the  i-th 
diagonal  entries  Kd,Uu,i,  Kd,w,i  and  Kd,vv,i  respectively.  The  components  /'dtU>i,  f'd,v,ii 
Kd,uu,i,  Kd,uv,i  and  Kd,w,i  corresponding  to  the  i-th  location  are  defined  as  follows: 


/  d,ii,i 

f 

J   d,v,i 

f*d,uv,i 
**-d,vv,i 


IV 


W 


dh(xi  +  Ui,yj  +  Vj) 

dx 
dliixj  +  Uj^yi  +  Vj) 

dx 
sdhjxi  +  ui^i  +  vi)  2      d2h(xi  +  Ui,yi  +  Vi) 
{  dx  '  dx*  W 

dhjxj  +  iij,  y,  +  t;,-)     dh{xj  +  Uj,  yt-  +  t>,-)        d2h{xi  +  Uj,yj  +  vA 
^  dx  dy  dxdy 


(6.14) 
(6.15) 
(6.16) 


(dli(xj  +  uuyi  +  vi)  2      d2I1{xi  +  ui,yi  +  vi) 
[  dy  }   +  dy* 


(6.18) 
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where  w  =  h(xi  +  u,-, j/j  +  v,-)  —  Io(xi,Vi)-  In  the  above  equations,  the  values  of  the 
image  function  h  and  its  partial  derivatives  at  (xi  +  u,-,  ?/,•  +  Vi)  need  to  be  computed 
using  any  interpolation  technique.  In  our  implementation,  we  used  the  interpolating 
spline  under  tension  [57]  to  compute  these  values. 

6.3.2     Modified  Search  Direction  Scheme 

The  nonlinear  conjugate  gradient  algorithm  is  a  local  search  method  which  in- 
volves approximating  the  local  landscape  of  the  energy  function  by  a  quadratic  func- 
tion. The  use  of  gradient  to  obtain  the  search  direction  is  not  appropriate  at  the 
locations  where  the  local  landscape  of  the  energy  function  has  large  higher-order 
derivatives,  i.e.  the  local  landscape  is  very  nonlinear.  We  propose  a  new  search 
direction  scheme  to  modify  the  gradient  direction  in  nonlinear  conjugate  gradient  al- 
gorithm. The  idea  of  the  modified  search  direction  scheme  is  to  check  the  smoothness 
of  the  local  energy  function  for  every  component  of  flow  vector.  From  equation  6.10, 
we  can  see  that  the  smoothness  energy  is  already  quadratic,  therefore,  we  need  only 
to  consider  the  smoothness  of  the  first  term,  i.e.,  the  data  constraint  energy.  To  mea- 
sure the  local  smoothness  of  the  data  constraint  energy  for  the  flow  vector  (w,-,  u,-), 
we  use  the  inner  product  of  the  normalized  gradients  of  the  two  image  brightness 
functions  as  a  smoothness  measure  cr(u,-,t;,),  i.e. 


Wl-  ||VJ1(s,-  +  u,-,y,-  +  t;,-)||    ||VJo(s,-,y.-)ir  {        ' 
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Figure  6.1.  The  discrepancy  between  the  gradient  directions  of  the  function  IQ  and 
Ji  at  the  i-th  location  is  small  and  thus  the  use  of  the  gradient  direction  as  the 
search  direction  for  the  i-th  component  is  robust;  while  the  the  j-th  component  of  the 
gradient  direction  is  unreliable  for  a  local  search. 

Figure  6.1  depicts  the  gradient  directions  of  I0  and  I\  at  two  locations.  The 
gradient  directions  at  the  i-th  location  are  close,  therefore  the  use  of  it  as  the  search 
direction  is  robust  for  the  i-th  component.  The  two  gradient  directions  at  the  j-th 
location  differ  by  a  significant  angle,  which  means  the  j-th  component  of  the  gradient 
is  unreliable  as  a  local  search  direction. 

We  can  set  a  threshold  on  the  smoothness  measure  in  order  to  reject  the  unre- 
liable components  of  the  gradient  direction  as  the  basis  of  a  local  search  direction. 
In  addition,  this  smoothness  measure  is  also  used  as  a  confidence  measure  for  flow 
vectors. 

6.4     Experimental  Results 

In  this  section,  our  modified  gradient-based  regularization  algorithm  and  SSD- 
based  regularization  algorithm  are  tested  on  a  variety  of  synthetic  and  real  image 
sequences.  Comparisons  of  our  results  with  the  previously  reported  results  in  [3,  77] 
are  made  to  demonstrate  the  performance  of  our  proposed  algorithms. 
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The  angular  measure  of  error  [3]  between  the  computed  optical  flow  and  the 
true  flow  is  adopted  in  the  experiments  as  a  basis  for  comparison  with  the  previously 
reported  results.  Unlike  some  other  methods  which  only  produce  sparse  optical  flow, 
both  of  our  regularization  methods  presented  in  this  chapter  give  dense  optical  flow 
estimates  with  100%  density.  To  achieve  a  fair  comparison  with  the  reported  results, 
only  those  techniques  reporting  100%  flow  density  in  literature  are  included  in  our 
comparative  study. 

6.4.1     Modified  Gradient-based  Method 

In  the  implementation  of  our  modified  gradient-based  regularization  method, 
we  treat  the  zero-crossing  points  of  S(x,y,t)  as  the  discontinuity  locations.  The 
computation  of  the  derivatives  Ix,Iy  and  It  is  very  error-sensitive.  Therefore,  we 
pre-smooth  the  image  brightness  function  with  the  Gaussian  filter  prior  to  using  the 
central  difference  method  to  approximate  the  derivatives.  As  for  Sx,  Sy  and  St,  we 
directly  use  the  central  difference  scheme  to  approximate  them  since  the  process  of 
convolution  with  the  LOG  (Laplacian  of  Gaussian)  filter  for  generating  the  function 
S  already  involves  the  smoothing  operation. 

Two  standard  synthetic  image  sequences  are  tested  by  using  our  modified  gradient- 
based  regularization  method  to  compute  the  optical  flow.  These  two  image  sequences 
are  the  images  of  a  translating  Square  and  a  Diverging  Tree  reported  in  [3].  Figure 
6.2  shows  one  frame  from  each  sequence.  The  image  sizes  for  the  square  and  diverging 
tree  are  100  x  100  and  150  x  150  respectively. 
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(a) 


(b) 


Figure  6.2.  Frames  from  the  (a)  Square  2  and  (b)  Diverging  Tree  image  sequences. 

The  motion  in  the  Square  2  sequence  is  a  constant  translation.  In  our  experi- 
ment, the  regularization  parameter  A  is  chosen  to  be  10,  and  the  weighting  parameters 
a  and  fi  for  the  image  flow  constraint  and  contour-based  constraint  are  set  to  1.  The 
optical  flow  obtained  after  40  iterations  of  the  incomplete  Cholesky  preconditioned 
conjugate  gradient  algorithm  is  shown  in  figure  6.3(a).  The  errors  of  our  modi- 
fied gradient-based  regularization  method  and  the  other  gradient-based  methods  for 
this  translating  square  example  are  given  in  Table  6.1.  Only  the  results  from  those 
gradient-based  methods  that  yield  100%  density  flow  estimates  reported  in  [3]  are 
included  in  this  table  for  comparison.  It  is  obvious  that  our  modified  gradient-based 
regularization  method  drastically  outperforms  the  other  gradient-based  methods  in 
this  example.  Note  that  the  results  of  the  other  two  gradient-based  methods  in  table 
6.1  for  this  image  sequence  could  be  improved  by  increasing  the  degree  of  smoothness, 
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i.e.,  the  value  of  the  regularization  parameter  A,  since  the  correct  optical  flow  field 
is  uniform  in  this  example.  However,  large  regularization  parameter  tends  to  over- 
smooth  the  estimated  optical  flow  for  other  image  sequences.  Although  it  is  possible 
to  find  an  optimal  regularization  parameter  by  using  the  generalized  cross-validation 
method  [88],  this  method  is  extremely  computationally  expensive. 


Technique 

Average  Error 

Standard  Deviation 

Density 

Horn  and  Schunck  (modified) 

32.81° 

13.67° 

100% 

Nagel 

34.57° 

14.38° 

100% 

Our  method  (Gradient-based) 

0.21° 

0.06° 

100% 

Table  6.1.  Summary  of  Square  2  results. 
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(a) 


(b) 


Figure  6.3.  Computed  optical  flow  of  the  (a)  Square  2  and  (b)  Diverging  Tree  image 
sequences  after  40  iterations  of  the  incomplete  Cholesky  preconditioned  CG  algo- 
rithm. 


The  other  synthetic  data  is  the  more  realistic  Diverging  Tree  image  sequence. 
The  underlying  motion  in  this  example  is  a  bilinear  expansion  from  the  center  of 
the  image.    We  use  the  regularization  parameter  A  =  0.1  and  a  =  (3  =  1  in  this 


146 


example.  The  error  in  the  computed  optical  flow  after  40  iterations  of  the  incomplete 
Cholesky  preconditioned  CG  algorithm  is  presented  in  Table  6.2,  along  results  from 
other  gradient-based  methods  in  literature  that  yield  100%  flow  density  [3].  Once 
again,  our  modified  gradient-based  regularization  method  produces  more  accurate 
optical  flow  than  the  other  methods.  The  computed  optical  flow  is  shown  in  figure 
6.3(b). 


Technique 


Horn  and  Schunck  (modified) 


Nagel 


Uras  et  al. 


Our  method  (Gradient-based) 


Average  Error 


2.55° 


2.94° 


4.64° 


1.60° 


Standard  Deviation 


3.67° 


3.23° 


3.48° 


1.15° 


Density 


100% 


100% 


100% 


100% 


Table  6.2.  Summary  of  Diverging  Tree  results. 


Our  gradient-based  algorithm  was  also  tested  on  a  real  data  set  namely,  the 
Rotating  Rubic  cube.  Figure  6.4  shows  one  frame  of  the  Rubic  Cubic  sequence  and  the 
computed  optical  flow  after  40  iterations  of  the  incomplete  Cholesky  preconditioned 
conjugate  gradient  algorithm. 

6.4.2     SSD-based  Regularization  Method 

The  experimental  results  of  using  our  SSD-based  regularization  method  on  two 
synthetic  image  data  and  one  real  image  sequence  are  presented  here.  The  two 
synthetic  image  sequences  are  the  Sinusoid  and  the  Translating  Tree  image  sequences. 
Our  results  on  these  two  examples  are  compared  to  the  best  results  with  100%  density 
optical  flow  estimate  reported  in  [3]  and  [77]. 
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(a) 


(b) 


Figure  6.4.  (a)  One  frame  from  the  Rotating  Rubic  Cube  sequence  (b)  Computed 
optical  flow  after  40  iterations  of  the  incomplete  Cholesky  preconditioned  CG  algo- 
rithm. 


(a) 


(b) 


Figure  6.5.  Frames  from  the  (a)  Sinusoid  1  and  (b)  Translating  Tree  image  sequences. 
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The  Sinusoid  1  image  sequence  has  a  constant  translation  with  a  velocity  of 
(1.585,0.863)  pixels/frame.  The  image  size  for  each  frame  of  this  image  sequence 
is  100  x  100.  Each  frame  is  a  highly  textured  image  whose  brightness  function  is 
fast  changing  almost  everywhere.  The  regularization  parameter  A  is  set  to  be  300. 
Very  accurate  estimate  is  obtained  after  300  preconditioned  nonlinear  conjugate  gra- 
dient iterations.  Figure  6.6(a)  shows  the  computed  optical  flow.  Our  result  is  the 
best  among  the  three  depicted  in  Table  6.3.  Note  that  our  method  uses  only  two 
consecutive  images  to  estimate  the  optical  flow.  The  results  of  Fleet  and  Jepson's 
method  [18]  for  this  example  were  obtained  by  using  a  21-frame  image  sequence  [3]. 
To  achieve  a  fair  comparison,  we  depict  the  reported  results  of  Szeliski  and  Coughlan 
[77]  for  the  2-frame  image  sequence.  With  incorporation  of  more  frames  of  data,  the 
results  in  Szeliski  and  Coughlan  do  however  improve  over  those  depicted  in  the  table. 


Technique 

#of 
Frames 

Average 
Error 

Standard 
Deviation 

Density 

Fleet  and  Jepson 

21 

0.03° 

0.01° 

100% 

Szeliski  and  Coughlan 

2 

0.17° 

0.02° 

100% 

Our  method  (SSD-based) 

2 

0.02° 

0.01° 

100% 

Table  6.3.  Summary  of  Sinusoid  1  results. 


For  the  translating  tree  example,  each  frame  contains  more  complicated  highly 
textured  regions,  which  result  in  multiple  local  minima  in  the  associated  energy 
function  to  be  minimized.  To  avoid  being  trapped  in  a  local  minimum  during  the 
minimization  of  a  nonconvex  energy  function,  we  employ  the  scale-space  tracking 
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(a) 


(b) 


Figure  6.6.  Computed  optical  flow  of  the  (a)  Sinusoid  1  and  (b)  Translating  Tree 
image  sequences,  using  the  preconditioned  nonlinear  CG  algorithm. 

scheme  [89].  The  scale-space  tracking  is  achieved  by  applying  our  algorithm  on  the 
smoothed  image  pair  with  the  degree  of  smoothing  being  gradually  reduced  during 
the  energy  minimization.  We  start  from  a  a  =  1.1547  for  Gaussian  smoothing,  reduce 
to  a  a  =  0.5774  after  100  iterations  of  the  preconditioned  nonlinear  CG  algorithm, 
and  finally  set  a  to  0  after  200  iterations.  The  computed  optical  flow  is  shown  in  figure 
6.6(b).  The  error  in  the  computed  optical  flow  is  depicted  in  Table  6.4  along  with 
the  errors  of  best  results  with  100%  density  reported  in  [3]  and  [77].  Our  SSD-based 
regularization  method  gives  the  most  accurate  result  from  among  these  methods. 
Note  that  the  result  of  Szeliski  and  Coughlan  in  this  table  is  adopted  for  the  two- 
frame  case  to  be  compared  to  the  result  of  our  two-frame  algorithm,  although  more 
accurate  estimation  can  be  obtained  via  incorporating  more  frames  into  the  energy 
function  as  shown  in  [77]. 
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Technique 

#of 
Frames 

Average 
Error 

Standard 
Deviation 

Density 

Uras  et  al. 

15 

0.62° 

0.52° 

100% 

Szeliski  and  Coughlan 

2 

0.35° 

0.34° 

100% 

Our  method  (SSD-based) 

2 

0.30° 

0.19° 

100% 

Table  6.4.  Summary  of  Translating  Tree  results. 
Finally,  our  SSD-based  regularization  method  is  applied  to  a  real  image  sequence 
namely,  the  Hamburg  Taxi  sequence.  One  frame  of  this  sequence  and  the  computed 
optical  flow  are  shown  in  figure  6.7.  The  computed  optical  flow  between  two  con- 
secutive images  was  obtained  after  200  iterations  of  the  preconditioned  nonlinear 
conjugate  gradient  algorithm. 


(a) 


(b) 


Figure  6.7.   (a)  One  frame  from  the  Hamburg  Taxi  sequence  (b)  Computed  optical 
flow  after  200  iterations  of  the  preconditioned  nonlinear  CG  algorithm. 


6.5     Summary  and  Discussion 

In  this  chapter,  we  presented  two  new  algorithms  for  robust  and  efficient  compu- 
tation of  the  optical  flow  from  an  image  sequence.  One  is  the  modified  gradient-based 
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regularization  algorithm,  and  the  other  is  the  SSD-based  regularization  algorithm. 
The  modified  gradient-based  regularization  method  selectively  combines  the  image 
flow  constraint  and  the  contour-based  flow  constraint  into  the  data  constraint.  The 
image  flow  constraint  is  disabled  in  the  neighborhood  of  discontinuities,  while  the 
contour-based  flow  constraint  is  active  at  discontinuity  locations.  A  more  precise 
data  constraint  is  obtained  by  using  this  selective  combination  scheme  than  the  one 
used  by  other  gradient-based  methods  in  literature.  Flow  estimates  obtained  us- 
ing our  method  are  more  accurate  than  those  obtained  from  techniques  reported  in 
literature  [3].  This  is  evident  from  the  experimental  results. 

Our  SSD-based  regularization  method  uses  the  SSD  measure  as  the  data  con- 
straint in  a  regularization  framework.  The  resulting  energy  function  is  neither  quadratic 
nor  convex.  The  preconditioned  nonlinear  conjugate  gradient  with  a  modified  search 
direction  is  developed  to  minimize  this  energy  function.  Very  accurate  results  were 
obtained  by  applying  this  algorithm  to  two  consecutive  images  of  a  sequence. 

Comparing  these  two  algorithms  proposed  in  this  chapter,  we  conclude  that  the 
modified  gradient-based  regularization  algorithm  is  more  computationally  efficient 
than  the  SSD-based  regularization  algorithm,  while  the  latter  gives  more  accurate 
results  than  the  former.  Also,  we  have  experimentally  demonstrated  the  superiority 
of  our  algorithms  for  optical  flow  computation  over  those  reported  to  date  in  litera- 
ture. Future  work  will  focus  on  automatically  detecting  motion  discontinuities  in  the 
framework  presented  here. 


CHAPTER  7 
RESEARCH  DIRECTIONS 


7.1.     Introduction 

In  this  thesis,  we  have  presented  efficient  computational  methods  for  early  vision 
problems  with  prescribed  discontinuities.  In  this  context,  several  research  issues 
presented  in  the  following  paragraphs  will  benefit  from  further  investigation. 

In  the  regularization  and  MRF  formulations  for  early  vision  problems,  the  reg- 
ularization  parameters  are  usually  chosen  in  an  ad  hoc  fashion.  Without  adapting 
the  parameters  to  the  problem,  it  is  easy  to  end  up  with  an  over-smoothed  or  under- 
smoothed  result.  The  generalized  cross-validation  method  [88]  has  been  popularly 
used  to  determine  the  regularization  parameters.  However,  the  computational  cost 
of  this  method  is  extremely  high.  In  addition,  this  method  can  not  be  used  when 
discontinuity  is  included  in  the  problem.  It  is  important  to  have  an  efficient  and 
general  technique  to  determine  the  regularization  parameters  to  obtain  better  results 
for  early  vision  problems. 

The  adaptive  preconditioner  constructed  in  a  wavelet  basis  can  be  applied  to 
any  regularization  formulation  in  early  vision.  In  fact,  we  can  use  similar  construction 
of  the  adaptive  preconditioner  for  the  linear  system  discretized  from  any  boundary 
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value  problems  in  PDEs.  The  extension  is  straightforward  by  replacing  the  spec- 
tral characteristics  of  the  membrane  or  thin-plate  stabilizer  by  that  of  the  associated 
differential  operator.  There  is  no  restriction  on  the  order  of  the  partial  differential 
equation.  From  the  success  of  our  adaptive  preconditioner  in  the  early  vision  prob- 
lems, we  believe  the  adaptive  preconditioner  will  lead  to  drastic  acceleration  for  the 
numerical  solution  of  the  boundary  value  problems. 

In  the  construction  of  our  adaptive  preconditioner,  we  approximate  the  asso- 
ciated regularization  filter  by  a  linear  shift-invariant  filter  in  a  wavelet  basis,  since 
the  modulation  matrix  in  our  preconditioner  gives  uniform  weighting  to  the  wavelet 
basis  functions  corresponding  to  the  same  frequency  band.  As  discussed  in  section 
5.2,  the  regularization  filter  is  spatially  varying  in  general,  especially  for  the  regu- 
larization problems  with  discontinuities  or  sparse  data  constraints.  However,  it  is 
possible  to  construct  a  spatially  varying  filter  in  a  wavelet  basis  to  achieve  a  better 
approximation  to  the  regularization  filter. 

The  other  possibility  to  design  an  adaptive  preconditioner  is  to  approximate  the 
generalized  inverse  of  (K0  +  UVT)  by  using  our  generalized  capacitance  matrix  algo- 
rithm or  the  generalized  Sherman-Morrison- Woodbury  formulas  [11,  34,  67].  Using 
this  approximation,  we  can  design  a  new  adaptive  preconditioner  that  accounts  for 
discontinuities  and  sparse  data  constraints  in  early  vision  problems.  This  will  be  the 
focus  of  our  future  research. 

The  important  issue  of  discontinuity  detection  has  not  been  discussed  in  the 
previous  chapters.    The  early  vision  problem  with  discontinuity  detection  can  be 
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formulate  in  a  deterministic  or  probabilistic  framework  (see  chapter  2),  leading  to 
a  coupled  (binary-real)  nonconvex  optimization  problem.  In  this  chapter,  we  will 
briefly  discuss  existing  techniques  for  this  problem  and  propose  a  novel  hybrid  search 
algorithm  as  a  possible  approach  toward  solving  this  problem.  This  hybrid  search 
algorithm  involves  a  combination  of  a  stochastic  and  a  deterministic  search  tech- 
niques. For  the  stochastic  search,  we  employ  a  novel  informed  genetic  algorithm 
(GA),  whereas  for  the  deterministic  part,  we  employ  either  the  generalized  capaci- 
tance matrix  algorithm  or  the  adaptive  preconditioned  conjugate  gradient  algorithm, 
presented  in  chapter  4  and  5,  respectively.  Some  promising  preliminary  experimental 
results  of  applying  this  hybrid  search  algorithm  on  surface  reconstruction  problems 
are  presented.  However,  this  hybrid  search  technique  will  have  to  be  considerably 
researched  upon  prior  to  deciding  upon  its  practicality. 

7.2     Previous  Energy  Minimization  Methods 

In  chapter  2,  we  have  seen  that  both  the  deterministic  formulation  and  proba- 
bilistic formulation  of  early  vision  problems  with  discontinuity  detection  give  rise  to 
minimizing  a  nonconvex  function.  Note  that  in  some  applications,  a  local  optimum 
may  provide  a  satisfactory  solution.  However,  in  certain  other  applications,  local 
optimal  solutions  are  unsatisfactory.  A  number  of  optimization  techniques  have  been 
proposed  to  solve  this  problem.  We  briefly  discuss  their  pros  and  cons  in  this  section. 

There  are  two  classes  of  optimization  methods  proposed  for  solving  this  non- 
convex  minimization  problems;  one  is  deterministic,  and  the  other  is  stochastic.  The 
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deterministic  methods  include  the  GNC  (graduated  non-convexity)  algorithm,  ana- 
log networks,  mean  field  annealing  and  dynamic  programming.  As  for  the  stochastic 
methods,  there  are  the  simulated  annealing,  MPM  (maximizer  of  posterior  marginals) 
and  the  GA  (genetic  algorithm). 

7.2.1     Deterministic  Optimization  Methods 

Most  of  the  deterministic  minimization  methods  can  not  find  the  global  min- 
imum solution  for  the  nonconvex  function  in  equation  2.6,  instead  they  lead  to  an 
approximate  solution  or  a  locally  optimal  solution.  But  they  can  converge  much 
faster  than  the  stochastic  methods.  The  following  deterministic  methods  have  been 
proposed  in  literature  for  this  nonconvex  minimization  problem.  We  give  a  brief 
description  of  the  same. 

GNC  algorithm 

The  GNC  (Graduated  NonConvex)  algorithm  was  proposed  by  Blake  and  Zisser- 
man  [8]  and  can  be  considered  as  a  continuation  method  widely  known  in  numerical 
analysis  literature  [14].  For  the  GNC  algorithm  to  minimize  the  nonconvex  func- 
tion E(u,h,v)  given  in  equation  2.6,  a  dual  energy  F(u)  is  derived  by  the  following 
equation 

F(u)  =  infE(u,h,v).  (7.1) 

h,v 

A  sequence  of  approximating  functions  F^p\u),  0  <  p  <  1,  are  constructed  such  that 
F^  is  convex  and  F^  is  equal  to  the  dual  energy  F.  Then,  gradient  decent  tech- 
niques are  employed  to  minimize  F^  for  a  prescribed  decreasing  sequence  of  values 
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of  p  from  1  to  0.  In  the  form  described  in  [8],  the  GNC  algorithm  is  applicable  only 
for  the  case  when  there  is  no  interaction  between  the  individual  line  process  variables, 
i.e.  no  smoothness  constraint  is  imposed  on  the  line  process.  This  choice  of  line  en- 
ergy is  purely  heuristic,  since  it  tries  to  penalize  only  the  inclusion  of  discontinuities. 
The  GNC  algorithm  proposed  in  [8]  can  only  deal  with  this  particular  form  of  line 
energy. 

Recently,  Bedini  et  al.  [4]  generalized  the  GNC  algorithm  to  allow  some  kind  of 
interactions  between  line  process  variables.  The  new  line  energy  contains  the  inhibi- 
tion of  parallelism  of  discontinuities  and  is  expressed  in  a  closed  form.  However,  it  is 
very  difficult  to  generalize  the  GNC  algorithm  to  include  arbitrary  line  energies,  i.e., 
general  constraints  on  the  line  process,  since  the  procedure  of  dual  energy  in  the  GNC 
algorithm  construction  to  eliminate  the  line  process  will  be  extremely  complicated  for 
general  cases.  In  addition,  the  GNC  algorithm  cannot  guarantee  to  find  the  globally 
minimal  solution  [47].  However,  as  stated  earlier,  it  does  yield  satisfactory  solutions 
in  a  number  of  applications. 

Analog  networks 

This  approach  is  very  promising  since  it  can  lead  to  a  real-time  implementation. 
The  idea  of  applying  analog  network  to  solve  this  nonconvex  minimization  problem 
is  in  spirit  similar  to  Hopfield  and  Tank's  neural  network  [36]  approach  for  solving 
combinatorial  optimization  problems.  The  original  energy  function  in  equation  2.6  is 
approximated  by  a  Lyapunov  function  such  that  the  associated  circuit  can  minimize 
this  Lyapunov  function  [44],  thus  leading  to  an  approximate  solution. 
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Analog  VLSI  circuit  based  solutions  were  developed  by  Harris  et  al.  [31]  for 
a  host  of  early  vision  problems.  These  analog  circuit  based  methods  lead  to  some 
impressive  practical  results.  However,  since  this  design  is  based  on  the  idea  of  line 
process  elimination  as  in  the  GNC  algorithm,  it  is  not  clear  as  to  how  the  network 
could  be  modified  to  include  general  line  energies. 

This  approach  does  permit  some  kinds  of  line  process  interactions  in  the  line 
energy,  but  the  line  energy  must  have  a  closed-form  expression.  In  general,  the  MRF 
model  doesn't  use  a  closed-form  expression  for  the  clique  potentials.  In  the  MRF 
model,  a  look-up  table  is  more  prevalent  for  designing  the  clique  potentials  for  the 
line  process. 

Mean  field  annealing 

The  mean  field  annealing  algorithm  [21]  uses  the  mean  field  approximation  and 
the  saddle  point  approximation  to  obtain  an  approximate  solution  to  the  MAP  esti- 
mation problem.  It  is  a  nice  deterministic  approximation  to  the  MRF  model  for  sim- 
ple clique  potential  functions.  Like  the  GNC  algorithm  and  neural  network  approach, 
mean  field  annealing  can  only  deal  with  specific  types  of  clique  potential.  When  the 
line  energy  includes  line  process  interactions,  i.e.,  the  neighborhood  system  for  line 
process  in  the  MRF  model  is  nonempty,  more  elaborate,  nontrivial  approximation 
methods  are  required  to  design  the  algorithm.  Consequently,  it  may  not  be  practical 
for  general  clique  potential  functions. 
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Dynamic  programming 

Dynamic  programming  is  a  global  optimization  algorithm  for  a  function  of  dis- 
crete variables  when  the  function  can  be  expressed  as  a  recurrence.  To  use  this 
algorithm,  the  real  variables  need  to  be  quantized  to  discrete  variables,  i.e.  taking 
values  from  a  finite  discrete  set.  In  [60],  the  recurrence  form  is  obtained  for  the  sim- 
plest line  energy  as  in  GNC,  i.e.  without  line  process  interactions  in  the  line  energy. 
This  is  a  degenerate  case  in  an  MRF  model,  since  it  happens  when  the  neighborhood 
system  for  line  process  is  empty.  It  is  nontrivial  to  obtain  a  recurrence  form  for 
general  clique  potential  functions. 

Although  dynamic  programming  can  achieve  a  global  optimum  for  the  dis- 
cretized  problems,  the  the  computational  cost  is  very  high  for  large  size  problems 
[8,  60],  thus  rendering  it  impractical. 

Nonlinear  diffusion 

Nonlinear  diffusion  [70,  64]  has  been  popularly  employed  as  an  energy  mini- 
mization tool  for  various  vision  problems  recently.  To  apply  this  technique  to  the 
coupled  nonconvex  minimization  problem,  Shah  [70]  changed  the  binary  line  process 
variables  to  the  edge  indicator  real  variables  which  assume  real  values  between  0  and 
1.  When  the  value  is  close  to  1,  it  implies  that  the  point  is  in  the  vicinity  of  an  edge, 
and  vice  versa.  Then,  a  coupled  partial  differential  equation  system  is  established 
from  this  modified  energy  functional.  The  solution  update  is  based  on  this  coupled 
PDE  system.  This  technique  seems  to  be  very  promising  in  terms  of  computational 
efficiency.    However  it  has  been  applied  only  to  the  simplest  line  energy  case,  i.e. 
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no  interaction  between  line  process  variables.    In  addition,  this  technique  can  not 
guarantee  the  globally  optimal  solution. 

All  the  above  deterministic  optimization  methods  can  only  be  applied  to 
some  restricted  energy  functions  whose  line  energies  contain  either  no  line  process 
interactions  or  very  simple  line  process  interactions.  The  line  energies  encode  the 
constraints  or  prior  information  we  impose  on  the  line  process.  Bedini  et  al.  [4]  used 
a  line  energy  with  simple  line  process  interactions  and  obtained  much  better  results 
than  those  with  the  simplest  line  energy,  i.e.,  no  interactions  between  line  processes. 
In  an  MRF  model,  the  line  energy  is  the  summation  of  line  clique  potentials.  When 
the  neighborhood  system  in  the  MRF  model  is  nonempty,  there  are  line  process 
interactions  in  the  line  energy.  In  general,  the  line  clique  potentials  are  accessed 
through  a  look-up  table,  and  no  closed-form  expression  is  available.  The  line  clique 
potentials  are  very  crucial  to  the  performance  of  the  results  [54].  They  are  regarded 
as  the  parameters  in  an  MRF  model  and  can  be  estimated  for  different  classes  of 
object  [54]  to  achieve  better  performance. 

Furthermore,  Most  of  the  above  mentioned  deterministic  methods  with  the  ex- 
ception of  dynamic  programming,  can  only  reach  an  approximate  solution  or  a  locally 
optimal  solution.  In  fact,  the  quantization  of  the  real  variables  in  dynamic  program- 
ming may  also  be  regarded  as  an  approximation. 
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7.2.2     Stochastic  Optimization  Methods 

Three  stochastic  methods  for  the  MAP  estimation  problems  will  be  discussed  in 
the  following.  Unlike  the  deterministic  optimization  methods,  there  is  no  restriction 
on  the  clique  potentials  for  the  stochastic  methods.  In  addition,  simulated  annealing 
as  well  as  the  genetic  algorithm  can  converge  to  a  global  optimum.  The  drawback 
of  stochastic  optimization  methods  is  that,  their  computational  cost  is  very  high, 
compared  to  most  of  the  deterministic  methods.  However,  we  will  present  a  novel 
hybrid  search  algorithm  that  has  yielded  promising  and  practically  feasible  prelimi- 
nary results. 

Simulated  annealing 

Simulated  annealing  has  been  known  as  a  general  global  optimization  algorithm. 
No  restriction  on  the  clique  potential  functions  need  be  imposed  in  this  approach. 
The  concept  of  this  algorithm  comes  from  statistical  physics  and  is  based  on  gradually 
lowering  a  parameter  T  that  corresponds  to  temperature  in  a  physical  system.  As 
T  decreases,  samples  from  the  posterior  distribution  approach  toward  the  minimal 
energy  solution.  Geman  and  Geman  [22]  developed  the  Gibbs  sampler  with  the 
simulated  annealing  algorithm  to  solve  the  MAP  estimation  problem.  In  order  to 
achieve  the  global  minimum,  the  cooling  schedule  for  the  temperature  must  be  very 
slowly  decreased  using  the  formula  T(k)  =  lo  n+k)'  wnere  T(k)  is  the  temperature 
during  the  k-th  iteration  and  c  is  a  constant.  Since  the  cooling  schedule  must  be 
very  slow  to  guarantee  global  convergence,  the  computational  cost  for  the  simulated 
annealing  algorithm  is  enormous. 
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Some  variants  of  the  simulated  annealing  algorithm  have  been  reported  in  liter- 
ature, e.g.  microcanonical  annealing  [13].  However,  their  computational  complexities 
are  similar  to  the  standard  annealing. 

MPM  method 

The  MPM  (maximizer  of  posterior  marginals)  method  was  proposed  by  Marro- 
quin  et  al.  [51].  It  is  an  approximation  to  MAP  estimation,  i.e.  the  MPM  method 
only  provides  an  approximate  solution.  Unlike  the  gradual  temperature  cooling  sched- 
ule in  the  simulated  annealing  algorithm,  the  MPM  method  executes  the  Metropolis 
algorithm  at  a  fixed  temperature  to  collect  statistics  about  the  equilibrium  behav- 
ior of  the  coupled  MRF.  Therefore,  this  method  requires  far  less  computation  than 
simulated  annealing. 

7.3     Proposed  Hybrid  Search  Algorithm 

In  this  section,  we  present  a  novel  hybrid  (stochastic  +  deterministic)  search 
algorithm  as  a  possible  solution  to  the  coupled  (binary-real)  nonconvex  optimization 
problem.  This  hybrid  search  algorithm  consists  of  an  informed  genetic  algorithm 
(GA)  and  either  one  of  our  generalized  capacitance  matrix  algorithm  or  the  adaptive 
preconditioned  conjugate  gradient  algorithm.  The  informed  genetic  algorithm  is  em- 
ployed as  a  global  minimizer  on  the  binary  line  process.  Inside  the  GA,  either  one  of 
the  aforementioned  deterministic  algorithms  is  used  on  the  real  (  surface  )  variables 
to  solve  the  quadratic  and  convex  minimization  problems  for  each  given  line  process 
configuration  in  the  GA. 
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As  shown  in  chapter  2,  either  the  deterministic  formulation  or  the  probabilistic 
formulation  leads  to  the  minimization  of  a  nonconvex  energy  function  £(u,l),  where 
1  =  [h  v],  given  in  equation  2.6  or  2.13.  This  energy  function  is  a  hybrid  of  the  real 
variables  and  boolean  variables.  We  know  that  the  GA  is  well  suited  for  combinato- 
rial optimization  problems  because  of  its  bit-string  representation.  This  nonconvex 
minimization  problem  can  be  transformed  to  a  combinatorial  optimization  problem 
as  follows. 

The  energy  function  E(u,  1)  can  be  rewritten  as 

£(u,l)    =    E(u\l)  +  U(\),  (7.2) 

£(n|l)    =    ^||Kdu-d||2  +  t/(u|l),  (7.3) 

u(\)  =  S>(i).  (7-4) 

c 

For  the  four-connected  neighborhood  system,  the  conditional  clique  potential  of  the 
real  variables  u  given  the  line  process  1,  U(u  |  1),  is  analogous  to  the  quadratic  mem- 
brane smoothness  energy,  i.e., 

U(u\l)    =    £Vc(u|l) 
c 

=    S((u<+i,j  -  "u,j)2(l  -  hij)  +  (utj+i  -  uu,j)2(l  -  viti)).      (7.5) 
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We  can  transform  the  minimization  of  E(u,  1)  with  respect  to  u,  1  to  the  mini- 
mization of  a  function  of  the  boolean  variables  1  only  as  follows. 

min£(u,l)    =    min{min£(u,l)j  (7.6) 

=    minjtf(l)  +  min£(u|l)j  (7.7) 

Define  the  function 

£'(!):=  t/(l)  +  min£(u|l)  (7.8) 
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Then  the  original  nonconvex  minimization  problem  is  transformed  to  the  combinato- 
rial optimization  problem,  i.e.  minimization  of  a  function  of  boolean  variables.  Note 
that  the  function  E(u  |  1)  is  a  quadratic  and  convex  function  of  u  when  the  operator 
in  the  observation  model,  Kj,  is  linear. 

Previously,  hybrid  algorithms  [52,  5]  were  proposed  to  use  simulated  annealing 
for  the  binary  line  process  and  analog  network  for  minimizing  the  quadratic  function 
E(u  |  1)  for  each  fixed  line  process.  In  our  hybrid  search  algorithm,  we  use  an  informed 
genetic  algorithm  to  minimize  this  new  energy  function.  To  compute  the  new  energy 
function  value  in  the  GA,  we  need  to  minimize  the  convex  and  quadratic  function 
E(u  |  1).  The  minimum  of  a  quadratic  and  convex  function  can  be  obtained  by 
solving  a  linear  system  with  an  associated  symmetric  positive-definite  (SPD)  matrix. 
Either  one  of  the  two  fast  algorithms  namely,  the  generalized  capacitance  matrix 
algorithm  and  the  adaptive  preconditioned  conjugate  gradient  algorithm,  presented 
in  this  thesis,  can  be  used  to  solve  the  SPD  linear  system. 
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The  overall  structure  of  our  algorithm  is  to  use  the  GA  as  a  global  minimizer 
to  the  function  £'(1)  with  a  fast  quadratic  convex  function  minimizer  inside  the  GA 
to  obtain  the  function  value  of  E'(\).  It  is  important  to  have  a  very  fast  quadratic 
convex  function  minimizer  since  we  need  to  compute  the  function  value  E'(\)  for 
each  new  line  process  candidate  solution  1  at  each  generation.  We  employ  either 
one  of  our  fast  algorithms  namely,  the  generalized  capacitance  matrix  algorithm  or 
the  adaptive  preconditioned  conjugate  gradient  algorithm,  to  solve  the  minimization 
problems  with  prescribed  discontinuities. 

7.3.1     Elements  of  Genetic  Algorithms 

The  GA  is  an  adaptive  search  algorithm  based  on  principles  derived  from  the 
dynamics  of  natural  population  genetics.  It  has  been  successfully  applied  to  combina- 
torial optimization,  function  optimization,  classifier  systems,  structure  optimization, 
pattern  recognition,  and  other  areas  [27]  [24]. 

A  population  of  candidate  solutions  is  generated  in  the  beginning  as  the  initial 
population,  denoted  by  P(0).  Then,  the  GA  operators,  i.e.  reproduction,  crossover 
and  mutation,  are  applied  to  the  current  population  P(t)  to  generate  the  descendant 
population  P(t  +  1).  Subsequently,  the  algorithm  is  an  iterative  adaptation  from 
one  generation  to  the  next  based  on  the  genetic  operators  and  the  fitness  function. 
The  GA  operators  are  designed  to  follow  the  survival  of  fittest  principle.  The  fitness 
function,  which  is  a  measure  of  fitness,  in  the  GA  is  directly  related  to  the  objective 
function  to  be  minimized  or  maximized  in  an  optimization  problem. 
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GA  can  be  directly  employed  to  solve  the  optimization  problem  by  denning  a 
positive  fitness  function  F(x)  in  terms  of  the  objective  function  E(x)  which  is  to 
be  optimized.  For  example,  we  can  choose  F(x)  =  E0  +  E(x)  for  the  maximization 
problem  and  F(x)  =  E0  —  E(x)  for  the  minimization  problem,  where  the  constant  EQ 
is  used  to  make  sure  the  fitness  F(x)  is  positive.  The  fitness  function  is  a  measure 
for  the  fitness  of  survival. 

A  representation  of  the  solution  space  has  to  be  chosen  for  the  GA  operators. 
Normally,  the  bit-string  representation  is  employed  for  simplicity.  To  use  the  bit- 
string  representation  for  the  real  variables  (surface),  we  have  to  discretize  the  contin- 
uous solution  space.  However,  this  representation  is  well  suited  for  boolean  variables, 
like  the  line  process. 

A  sketch  of  the  genetic  algorithm  is  as  follows  [69]: 

1.  Choose  an  initial  population. 

2.  Determine  the  fitness  function  value  for  each  individual. 

3.  Perform  selection  operation. 

4.  Perform  crossover  operation. 

5.  Perform  mutation  operation. 

6.  Check  stopping  criterion;  if  satisfied,  stop;  else  go  to  step  2. 

Genetic  operators 

In  this  section,  the  three  basic  genetic  operators  are  briefly  described. 
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Reproduction,  also  called  selection,  is  the  process  of  choosing  the  candidate  so- 
lutions for  the  next  generation  from  the  candidate  solutions  in  the  current  generation. 
Some  selection  schemes  have  been  proposed  [25],  e.g.  proportional  selection,  ranking 
selection,  tournament  selection  ,  ...,  etc. 

Crossover  selects  two  samples  from  the  current  population  as  "parents"  and  then 
mates  them  by  exchanging  a  part  of  their  "gene  values"  to  generate  their  "offsprings". 
The  "parents"  are  replaced  by  their  "offsprings"  in  the  current  population.  There 
are  many  possible  implementations  of  the  crossover  process.  For  example,  we  can 
randomly  select  two  samples,  a  start  position  and  an  end  position  in  the  gene  string, 
then  exchange  the  O's  and  l's  of  the  two  samples  between  the  two  positions.  An 
example  is  illustrated  in  figure  7.1. 

Parentl  =  OHOIOOOUOIOIOI 

Parent2  =  1101001010100110 

Childl  =  0110001010010101 

Childfl  =  1101100011100110 

Figure  7.1.  The  crossover  operation  example. 

The  mutation  operator  is  similar  to  the  perturbation  of  the  Gibbs  sampler  in 
the  simulated  annealing.  It  can  prevent  GAs  from  premature  convergence.  It  is  very 
easy  to  implement  this  operator.  Given  a  mutation  probability,  randomly  choose  a 
sample  and  a  position  and  flip  the  0  or  1  at  that  location  according  to  the  mutation 
probability. 
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Convergence  property 

Markov  chain  models  have  been  employed  to  analyze  the  convergence  of  simple 
genetic  algorithms  in  [15,  69,  73].  Davis  [15]  used  a  nonhomogeneous  finite  Markov 
chain  to  model  the  simple  GA  (population  size  is  fixed  but  the  mutation  and  crossover 
are  allowed  to  vary  with  the  iteration  index)  and  proved  that  the  convergence  rate 
of  a  simple  GA  is  far  superior  to  that  of  simulate  annealing.  Rudolph  [69]  employed 
a  homogeneous  finite  Markov  chain  to  model  a  canonical  GA  (  fixed  population  size, 
proportional  selection,  fixed  crossover  and  mutation  probabilities  ),  and  proved  that 
a  canonical  GA  never  converges  to  the  global  optimum  regardless  of  the  initialization. 
But,  he  also  showed  that  modified  versions  of  canonical  GAs  with  elitist  selection  ( 
the  best  individual  survives  with  probability  one  )  asymptotically  converge  to  the 
global  minimum  in  the  probabilistic  sense.  Following  the  Rudolph's  work,  Suzuki 
[73]  proved  that  the  probability  of  finding  the  global  minimum  in  n  generations  for  a 
canonical  GA  with  elitist  selection  is  lower  bounded  by  1  —  0(|A.|n),  where  |A»|  <  1 
and  A*  is  the  largest  eigenvalues  of  the  diagonal  sub-matrices  in  the  transition  matrix. 
In  addition,  an  upper  bound  for  |A*|  was  derived  in  terms  of  the  mutation  probability. 
Therefore,  one  can  find  the  mutation  probability  to  minimize  this  upper  bound  to 
achieve  best  convergence  performance  in  the  GA.  Because  of  the  advantage  of  elitist 
selection  proved  in  [69,  73],  this  strategy  is  included  in  our  informed  GA,  which  will 
be  discussed  in  the  next  section. 
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7.3.2     Informed  Genetic  Algorithm 

In  this  section,  we  present  an  informed  GA  which  incorporates  the  problem 
information  into  the  genetic  operators.  In  traditional  GAs  [24,  15,  69],  only  the 
reproduction  operator  depends  on  the  fitness  function,  while  the  crossover  and  mu- 
tation operators  randomly  explore  the  solution  space  with  prespecified  probabilities. 
Totally  random  search  is  a  very  inefficient  way  to  explore  the  solution  space,  and 
it  makes  the  GAs  converge  very  slowly  especially  when  the  entire  solution  space  is 
very  large,  such  as  early  vision  problems.  In  our  informed  GA,  we  use  an  informed 
mutation  operator  which  exploits  the  problem  information  in  the  search,  thus  making 
the  solution  search  in  the  GA  very  efficient. 

The  only  search  procedure  in  simulated  annealing  involves  the  use  of  the  Gibbs 
sampler  [22,  90]  which  is  a  random  perturbation  of  the  current  solution  and  is  similar 
to  the  mutation  operator  in  the  GA.  The  Gibbs  sampler  is  used  to  search  the  solution 
space  based  on  the  Gibbs  distribution  of  the  energy  function.  The  search  in  the 
Gibbs  sampler  makes  use  of  the  information  in  the  energy  function,  i.e.  the  problem, 
therefore  it  is  an  informed  search,  not  a  totally  random  search.  We  use  the  Gibbs 
sampler  as  an  informed  mutation  operator  in  our  informed  GA. 

Our  informed  GA  consists  of  the  reproduction  operator  and  an  informed  muta- 
tion operator  only.  The  main  idea  is  to  design  the  GA  operators  to  make  each  con- 
figuration of  candidate  solutions  in  the  generations  of  the  GA  to  be  a  sample  drawn 
from  the  Gibbs  distributions  of  the  energy  function  to  be  minimized.  To  achieve  this 
goal,  we  choose  our  initial  population  as  a  sample  from  a  Gibbs  distribution  of  the 
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energy  function  E'(\),  i.e., 


ftfl)    =    \e-W\  (7.9) 

z  =  Y,e~E'(l)/T'  (7-10) 

1 


where  T  is  a  constant.  Let  the  population  size  be  fixed  and  denoted  by  a  constant 
M.  By  using  a  Gibbs  sampler  to  obtain  M  samples  from  the  Gibbs  distribution 
given  in  equation  7.9,  we  obtain  an  initial  population  of  size  M,  denoted  by  the  set 

{1i,12,...,1m}- 

In  the  reproduction  operation,  we  use  the  Boltzmann  weighted  selection  [25,  16] 
scheme  such  that  the  survival  probability  for  the  i-th  individual  is  given  by 


PrOt)   =   jrte-E'™T,  (7-11) 

Z'    =    J2e-E'^T.  (7.12) 

.7  =  1 


For  the  informed  mutation  operator,  we  assign  the  probability  of  mutating  the 
i-th  individual  1,-  to  a  line  process  configuration  1  to  be  a  Gibbs  distribution  as  follows. 


Pm(h^i)    =    ^e-^-E'^/T  (7.13) 

Z"{\{)      =     £e-(£'(l)-*'(l.))/T  (7<14) 


170 


This  mutation  transition  probability  can  be  simplified  to  be 


Pm(li^l)  =  Ie-^I>/T.  (7.15) 


This  informed  mutation  operation  can  be  implemented  by  using  a  Gibbs  sampler 

[22,  90]. 

By  using  the  above  reproduction  and  mutation  operators  with  the  initial  popula- 
tion sampled  from  the  Gibbs  distribution  given  in  equation  7.9,  we  can  show  that  each 
individual  at  the  n-th  generation  is  a  sample  from  the  following  Gibbs  distribution 


Pn(\) 


jLe-(n+l)£'(l)/T5  (7-16) 


Z    =    £c-<»+i)iWr  (7.17) 

l 


Thus,  the  asymptotic  convergence  of  this  informed  GA  can  be  proved  directly  from 
equation  7.16  by  assuming  n  — ►  oo. 

Our  informed  GA  is  summarized  as  follows: 

1.  Choose  an  initial  population  from  a  Gibbs  distribution  in  equation  7.9. 

2.  Determine  the  fitness  function  value  for  each  individual. 

3.  Perform  Boltzmann  weighted  selection. 

4.  Perform  the  informed  mutation  operation  by  using  a  Gibbs  sampler. 

5.  Check  stopping  criterion;  if  satisfied,  stop;  else  go  to  step  2. 
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In  step  2,  we  need  to  compute  the  fitness  function  value  for  each  individual  line 
process  configuration.  To  compute  the  fitness  function  value  E'(l)  given  in  equation 
7.8,  it  involves  solving  a  convex  and  quadratic  minimization  problem.  This  can  be 
very  efficiently  solved  by  using  either  our  generalized  capacitance  matrix  algorithm 
or  our  adaptive  preconditioned  conjugate  gradient  algorithm,  presented  in  chapters 
4  and  5. 
7.3.3     Preliminary  Experimental  Results 

We  implemented  our  hybrid  search  algorithm  on  the  sparse  data  surface  recon- 
struction problem.  In  our  experiments,  our  hybrid  search  algorithm  consists  of  the 
informed  GA  for  the  binary  line  process  1  and  the  generalized  capacitance  matrix 
algorithm  for  determining  the  fitness  function  value  for  a  prespecified  line  process 
configuration.  The  sparse  data  set  (  see  figure  7.2(b)  )  is  obtained  from  sampling 
the  original  surface  shown  in  figure  7.2(a).  We  discretized  this  problem  on  a  32  x  32 
mesh.  The  sparse  data  set  shown  in  figure  7.2(b)  contains  64  data  points. 

In  our  experiments,  we  use  the  cliques  for  the  line  process  used  by  Geman  and 
Geman  [22],  shown  in  figure  7.3(a).  The  line  clique  potentials  for  different  line  process 
configurations  are  shown  in  figure  7.3(b).  This  choice  of  the  line  clique  potentials  is 
ad  hoc.  However,  they  can  be  obtained  via  parameter  estimation  in  MRF  models 
[54]. 

The  Gibbs  sampler  [22,  90]  is  used  to  implement  the  informed  mutation  operator 
in  our  informed  GA.  There  are  two  steps  in  the  Gibbs  sampler,  i.e.,  the  exploration 
step  (or  perturbation  step)  and  acceptance  step.    Each  time  after  the  exploration 
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(a) 


(b) 


Figure  7.2.  The  sparse  data  set  used  for  the  surface  reconstruction  with  discontinuity 
detection,  (a)  original  surface  (b)  sparsely  sampled  data  set. 

step,  we  need  to  compute  the  energy  function  value  for  the  perturbed  new  line  process 
configuration  and  then  determine  the  acceptance  probability  PaCcePt,  given  by 


*accept\\old       *  *new)  —   * 


1  when   E'(lnew)  <  E'(\M), 

e-(E'(\ncw)-E'(\old))IT     when     £,(lnew)  >  ^^ 


(7.18) 


where  \0u  is  the  line  process  before  the  exploration  step  and  \new  is  the  one  after  the 
exploration  step.  Computation  of  the  energy  function  value  £'(1)  given  1  involves 
minimizing  a  quadratic  convex  function,  which  is  solved  via  the  generalized  capac- 
itance matrix  algorithm.  It  is  necessary  to  repeat  these  two  steps  many  times  for 
the  Gibbs  sampler  to  converge,  therefore  the  computational  cost  is  extremely  high  in 
implementing  the  Gibbs  sampler.  In  our  experiments,  we  use  an  approximate  Gibbs 
sampler  to  reduce  the  computational  cost  drastically.  This  Gibbs  sampler  uses  the 
approximation  that  the  minimum  solution  u0/d  obtained  by  minimizing  the  function 
U(u\\0id)  is  similar  to  the  minimum  solution  unew  for  the  new  perturbed  line  process 
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Figure  7.3.   (a)  Clique  for  the  line  process  ( x :  line  process,  0:  nodal  variable  for 
surface),  (b)  clique  potentials  for  different  line  process  configurations. 
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\new.    By  using  this  approximation,  the  acceptance  probability  can  be  obtained  by 
computing  the  local  energy  change  in  E'  for  each  (  one-site  )  perturbation. 

In  our  experiments,  the  population  size  in  the  informed  GA  is  arbitrarily  chosen 
to  be  100.  By  using  our  hybrid  search  algorithm  on  the  sparse  data  set  shown  in 
figure  7.2(b),  the  converged  solution,  which  means  the  best  visited  solution  remains 
the  same  for  many  generations,  was  found  after  3  generations  of  the  informed  GA. 
Figure  7.4(a)  shows  the  reconstructed  surface  and  the  detected  discontinuities  is  given 
in  figure  7.4(b).  We  can  see  that  this  solution  is  very  close  to  the  true  solution. 


—\ 


(a) 


(b) 


Figure  7.4.  Surface  reconstruction  example:  (a)  the  reconstructed  surface  (b)  the 
recovered  discontinuity  map  (  solid  line  )  obtained  after  3  generations  of  the  informed 
GA  and  imposed  on  the  true  discontinuity  map  (  dashed  line  ). 


In  another  experiment,  we  added  noise  to  the  sparse  data  in  figure  7.2(b)  to  test 
our  hybrid  search  algorithm.  The  noise  for  each  data  point  i  is  a  uniform  random 
variable  ranging  in  the  interval  [— 0.2dt,  0.2d4],  where  d;  is  the  data  value  sampled 
from  the  original  shape  shown  in  figure  7.2(a).  Figure  7.5(a)  shows  the  noisy  sparse 
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data  set.  Our  hybrid  search  algorithm  found  the  converged  solution  after  5  gener- 
ations of  the  informed  GA.  The  reconstructed  surface  and  detected  discontinuities 
are  shown  in  figure  7.5(b)  &  (c),  respectively.  As  evident  from  this  experiment,  our 
hybrid  search  algorithm  has  obtained  a  very  accurate  solution  for  sparse  data  surface 
reconstruction  problems  with  high  noise. 


(a) 


(b) 


(c) 


Figure  7.5.  Surface  reconstruction  with  noise  example:  (a)  the  noisy  sparse  data  set, 
(b)  the  reconstructed  surface  (c)  the  recovered  discontinuity  map  (  solid  line  )  ob- 
tained after  5  generations  of  the  informed  GA  and  imposed  on  the  true  discontinuity 
map  (  dashed  line  ). 
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With  regards  to  computational  effort,  in  the  above  surface  reconstruction  exper- 
iments, our  generalized  capacitance  matrix  algorithm  usually  takes  less  than  1  second 
on  a  multi-user  SUN  SPARC-10  workstation  to  solve  a  convex  quadratic  minimiza- 
tion problem.  The  population  size  was  set  to  100  in  our  hybrid  search  algorithm.  In 
the  experiments,  three  generations  of  the  hybrid  search  algorithm  were  completed  in 
less  than  3  minutes.  For  5  generations  of  the  same,  it  took  less  than  6  minutes.  The 
structure  of  our  hybrid  search  algorithm  is  primarily  a  genetic  algorithm,  which  is 
easily  parallelizable  [71],  thus  making  it  straightforward  to  port  this  algorithm  to  a 
parallel  machine.  A  parallel  implementation  will  make  our  algorithm  a  very  useful 
and  efficient  technique  for  solving  hard  nonconvex  optimization  problems. 

7.3.4     Discussion 

A  hybrid  search  algorithm  is  presented  in  this  chapter  to  solve  the  coupled  ( 
binary-real)  nonconvex  optimization  problem.  Our  hybrid  search  algorithm  consists 
of  a  novel  informed  GA  and  either  one  of  our  generalized  capacitance  matrix  algo- 
rithm or  adaptive  preconditioned  conjugate  gradient  algorithm.  The  informed  GA 
is  used  as  a  stochastic  global  minimizer  for  a  new  energy  function  consisting  of  the 
binary  line  process  variables  only.  Inside  the  GA,  either  one  of  the  above  two  de- 
terministic algorithms  is  used  to  determine  the  new  energy  function  values  for  each 
line  process  configuration  visited  in  the  GA.  The  performance  of  this  hybrid  search 
algorithm  is  demonstrated  through  experiments  on  the  sparse  data  surface  recon- 
struction problem.  Since  the  GA  is  a  highly  parallizable  algorithm,  we  can  improve 
the  efficiency  of  this  hybrid  search  algorithm  by  porting  it  to  a  parallel  machine. 
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The  computational  cost  of  our  hybrid  search  algorithm  will  be  very  high  when 
the  problem  size  is  large.  One  possible  way  to  improve  the  efficiency  of  this  algorithm 
is  to  modify  the  informed  GA  as  a  multiscale  stochastic  optimization  method.  To 
achieve  this,  it  is  necessary  to  construct  a  multiscale  representation  for  the  line  process 
field.  Heitz  et  al.  [33]  constructed  a  multiscale  representation  for  a  label  field  and 
used  a  multiscale  relaxation  algorithm  to  minimize  the  energy  function  for  some  early 
vision  problems.  It  is  possible  to  generalize  this  construction  for  the  line  process 
field  and  apply  a  multiscale  informed  GA  to  make  the  hybrid  search  algorithm  more 
efficient. 


CHAPTER  8 
CONCLUSIONS 

This  thesis  reported  the  development  of  several  novel  and  efficient  computational 
algorithms  for  solving  early  vision  problems  using  either  a  regularization  formulation 
or  an  MRF  formulation. 

We  presented  the  generalized  capacitance  matrix  theorems  and  an  accompany- 
ing algorithm  for  solving  the  linear  systems  arising  from  the  discretization  of  the 
early  vision  problems  with  prespecified  discontinuities.  By  using  these  theorems,  we 
transformed  the  original  linear  system  arising  from  the  discretization  of  early  vision 
problems  into  a  Lyapunov  matrix  equation  or  a  cascaded  of  two  Lyapunov  matrix 
equations  with  an  appropriate  right-hand  side.  The  Lyapunov  matrix  equations  are 
solved  by  the  ADI  method,  which  was  proved  to  converge  for  a  prespecified  error 
tolerance  in  a  constant  number  of  iterations  with  each  iteration  taking  0(N)  oper- 
ations. The  right-hand  side  of  the  Lyapunov  matrix  equation  can  be  obtained  by 
solving  an  associate  capacitance  matrix  linear  system.  A  modified  BCG  algorithm 
was  used  to  solve  this  dense,  nonsymmetric  and  indefinite  linear  system.  This  al- 
gorithm was  implemented  and  tested  on  the  surface  reconstruction  and  shape  from 
shading  problems  to  demonstrate  its  efficiency. 

As  discussed  in  chapter  5,  the  generalized  capacitance  matrix  algorithm  is  not 
very  efficient  for  the  early  vision  problems  with  dense  data  and  nonuniform  weighting 
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case.  Hence,  we  presented  an  alternative  algorithm  called  the  adaptive  precondition- 
ing technique  as  a  general  solution  to  early  vision  problems  with  prespecified  disconti- 
nuities. This  adaptive  preconditioner  was  constructed  in  a  wavelet  basis  based  on  the 
approximation  to  the  spectral  characteristics  of  the  smoothness  and  data  constraints 
by  using  the  division  of  frequency  domain  property  in  a  wavelet  basis.  We  empirically 
showed  the  superiority  of  this  adaptive  preconditioner  over  other  preconditioned  pre- 
viously proposed  in  vision  literature  through  experiments  on  surface  reconstruction, 
shape  from  shading  and  optical  flow  computation. 

In  the  context  of  robustness,  we  developed  two  new  and  robust  algorithms  for 
accurate  optical  flow  estimation.  One  is  a  modified  gradient-based  algorithm  which 
combines  the  image  flow  constraint  and  the  contour-based  flow  constraint  into  the 
data  constraint  in  a  regularization  framework  to  amend  the  errors  in  the  image  flow 
constraint  caused  by  brightness  discontinuities.  To  solve  the  linear  system  arising 
from  the  regularization  formulation,  our  adaptive  preconditioned  conjugate  gradient 
algorithm  was  employed,  leading  to  an  efficient  algorithm.  The  other  is  an  SSD- 
based  regularization  method  that  uses  the  SSD  measure  as  the  data  constraint  in 
a  regularization  framework.  The  preconditioned  nonlinear  conjugate  gradient  with 
a  modified  search  direction  scheme  was  developed  to  minimize  the  resulting  energy 
function.  Experimental  results  for  these  two  algorithms  are  given  to  demonstrate 
their  performance. 

Finally,  in  future  research  directions,  we  presented  a  novel  hybrid  search  algo- 
rithm which  involved  a  combination  of  stochastic  and  deterministic  search  techniques. 
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We  proposed  an  informed  genetic  algorithm  for  the  stochastic  search,  while  using  ei- 
ther one  of  our  generalized  capacitance  matrix  algorithm  or  adaptive  preconditioned 
conjugate  gradient  algorithm  for  the  deterministic  search.  We  presented  some  promis- 
ing preliminary  results  to  demonstrate  the  effectiveness  of  this  algorithm  in  recovering 
shape  along  with  its  discontinuities  from  sparse  range  data.  However,  extensive  ex- 
perimentation as  well  as  further  theoretical  research  needs  to  be  performed  prior  to 
judging  the  practicality  of  this  algorithm  in  solving  early  vision  problems  leading  to 
nonconvex  minimization. 
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